document the progress and development of this important research area. Also during this conference, one could follow important research in well-established areas, but also become acquainted with new research fields in optimization.

The conference was supported by the International Federation for Information Processing (IFIP), in particular through the Technical Committee 7, which selected the conference site, and the program committee, which put together an interesting program. Their support and help are greatly appreciated. Financial and technical support for this meeting came from the host institution, the University of Trier, and the government of the home state, Rheinland-Pfalz. Furthermore, the Deutsche Forschungsgemeinschaft (DFG), Siemens AG, and GeneralCologne Re Capital generously supported this conference.

Many committees, institutions and individuals contributed to the success of this conference. We thank the program committee of TC7, in particular P. Kall (chairman of the TC7), and the administration of the University of Trier. The organization of the conference would not have been possible without the help and support of many individuals: among those were H. Beewen, F. Leibfritz, J. Maruhn, U. Morbach, M. Pick, M. Ries, M. Schulze, C. Schwarz, T. Voetmann. We also appreciate the valuable assistance of the publisher, in particular Y. Lambert, in the preparation of the proceedings.

Trier, May 2003 

E. W. Sachs and R. Tichatschke 
Abstract We present efficiently verifiable sufficient conditions for the validity of specific NP-hard semi-infinite systems of semidefinite and conic quadratic constraints arising in the framework of Robust Convex Programming and demonstrate that these conditions are “tight” up to an absolute constant factor. We discuss applications in Control to the construction of a quadratic Lyapunov function for a linear dynamic system under interval uncertainty.



1. Introduction 

The subject of this paper is “tractable approximations” of intractable semi-infinite convex optimization programs arising as robust counterparts of uncertain conic quadratic and semidefinite problems. We start with specifying the relevant notions. Let $K$ be a cone in $\mathbf{R}^N$ (closed, pointed, convex and with a nonempty interior). A conic program associated with $K$ is an optimization program of the form

$$\min_x \{ f^T x \mid Ax - b \in K \}; \tag{CP}$$

here $x \in \mathbf{R}^n$. An uncertain conic problem is a family

$$\left\{ \min_x \{ f^T x \mid Ax - b \in K \} \;\Big|\; (f, A, b) \in \mathcal{U} \right\} \tag{UCP}$$

of conic problems with common $K$ and data $(f, A, b)$ running through a given uncertainty set $\mathcal{U}$. In fact, we always can get rid of uncertainty in $f$ and assume that $f$ is “certain”, i.e., common for all data from $\mathcal{U}$; indeed, we always can rewrite the problems of the family as

$$\min_{t,x} \left\{ t \;\Big|\; \begin{bmatrix} Ax - b \\ t - f^T x \end{bmatrix} \in \widehat{K} = K \times \mathbf{R}_+ \right\}.$$

Thus, we lose nothing (and gain a lot, as far as notation is concerned) when assuming from now on that $f$ is certain, so that $n$, $K$, $f$ form the common “structure” of problems from the family, while $A$, $b$ are the data of particular problems (“instances”) from the family.

The Robust Optimization methodology developed in [1, 2, 3, 5, 8, 9] associates with (UCP) its Robust Counterpart (RC)

$$\min_x \{ f^T x \mid Ax - b \in K \;\; \forall (A, b) \in \mathcal{U} \}. \tag{R}$$

Feasible/optimal solutions of (R) are called robust feasible, resp., robust optimal solutions of the uncertain problem (UCP); the importance of these solutions is motivated and illustrated in [1, 2, 3, 5, 8, 9].

Accepting the concept of robust feasible/optimal solutions, the crucial question is how to build these solutions. Note that (R) is a semi-infinite conic program and as such can be computationally intractable. In this respect, there are “good cases”, where the RC is equivalent to an explicit computationally tractable convex optimization program, as well as “bad cases”, where the RC is NP-hard (see [3, 5] for “generic examples” of both types). In “bad cases”, the Robust Optimization methodology recommends replacing the computationally intractable robust counterpart by its tractable approximation. An approximate robust counterpart of (UCP) is a conic problem

$$\min_{x,u} \{ f^T x \mid Px + Qu + r \in \widehat{K} \} \tag{AR}$$

such that the projection $X(\mathrm{AR})$ of the feasible set of (AR) onto the plane of $x$-variables is contained in the feasible set of (R); thus, (AR) is “more conservative” than (R). An immediate question is how to measure the “conservativeness” of (AR), with the ultimate goal to use “moderately conservative” computationally tractable approximate RCs instead of the “true” (intractable) RCs. A natural way to measure the quality of an approximate RC is as follows. Assume that the uncertainty set $\mathcal{U}$ is of the form

$$\mathcal{U} = \{ (A, b) = (A^{\mathrm{n}}, b^{\mathrm{n}}) + V \},$$

where $(A^{\mathrm{n}}, b^{\mathrm{n}})$ is the “nominal data” and $V$ is the perturbation set, which we assume from now on to be a convex compact set symmetric w.r.t. the origin. Under our assumptions, (UCP) can be treated as a member of the parametric family

$$\left\{ \min_x \{ f^T x \mid Ax - b \in K \} : (A, b) \in \mathcal{U}_\rho = \{ (A, b) = (A^{\mathrm{n}}, b^{\mathrm{n}}) + \rho V \} \right\} \tag{UCP$_\rho$}$$

of uncertain conic problems, where $\rho \ge 0$ can be viewed as the “level of uncertainty”. Observing that the robust feasible set $X_\rho$ of (UCP$_\rho$) shrinks as $\rho$ increases and that (AR) is an approximation of (R) if and only if $X(\mathrm{AR}) \subset X_1$, a natural way to measure the quality of (AR) is to look at the quantity

$$\rho(\mathrm{AR{:}R}) = \inf \{ \rho \ge 1 : X(\mathrm{AR}) \supset X_\rho \},$$

which we call the conservativeness of the approximation (AR) of (R). Thus, the fact that (AR) is an approximation of (R) with the conservativeness $< \alpha$ means that

(i) If $x$ can be extended to a feasible solution of (AR), then $x$ is a robust feasible solution of (UCP);

(ii) If $x$ cannot be extended to a feasible solution of (AR), then $x$ is not robust feasible for the uncertain problem (UCP$_\alpha$) obtained from (UCP) = (UCP$_1$) by increasing the level of uncertainty by the factor $\alpha$.

Note that in real-world applications the level of uncertainty normally is known “up to a factor of order of 1”; thus, we have basically the same reasons to use the “true” robust counterpart as to use its approximation with $\rho(\mathrm{AR{:}R})$ of order of 1.

The goal of this paper is to overview recent results on tractable approximate robust counterparts with “O(1)-conservativeness”, specifically, the results on semidefinite problems affected by box uncertainty and on conic quadratic problems affected by ellipsoidal uncertainty. We present the approximation schemes, discuss their quality, illustrate the results by some applications (specifically, in Lyapunov Stability Analysis/Synthesis for uncertain linear dynamic systems with interval uncertainty) and establish links of some of the results with recent developments in the area of semidefinite relaxations of difficult optimization problems.



2. Uncertain SDP with box uncertainty 



Let $\mathbf{S}^m$ be the space of real symmetric $m \times m$ matrices and $\mathbf{S}^m_+$ be the cone of positive semidefinite $m \times m$ matrices, and let $A^\ell[x] = \sum_{j=1}^{n} x_j A^{\ell j} : \mathbf{R}^n \to \mathbf{S}^m$ be affine mappings, $\ell = 0, ..., L$. Let also $C[x]$ be a symmetric matrix affinely depending on $x$. Consider the uncertain semidefinite program

$$\left\{ \min_x \left\{ f^T x : \begin{array}{l} C[x] \succeq 0 \\ A[x] \succeq 0 \end{array} \right\} : \exists (u : \|u\|_\infty \le \rho) : A[x] = A^0[x] + \sum_{\ell=1}^{L} u_\ell A^\ell[x] \right\} \tag{USD[$\rho$]}$$

here and in what follows, for $A, B \in \mathbf{S}^m$ the relation $A \succeq B$ means that $A - B \in \mathbf{S}^m_+$. Note that (USD[ρ]) is the general form of an uncertain semidefinite program affected by “box” uncertainty (one where the uncertainty set is an affine image of a multi-dimensional cube). Note also that the Linear Matrix Inequality (LMI) $C[x] \succeq 0$ represents the part of the constraints which are “certain”, i.e., not affected by the uncertainty.

The robust counterpart of (USD[ρ]) is the semi-infinite semidefinite program

$$\min_x \left\{ f^T x : \begin{array}{l} C[x] \succeq 0 \\ A^0[x] + \sum_{\ell=1}^{L} u_\ell A^\ell[x] \succeq 0 \quad \forall (u : \|u\|_\infty \le \rho) \end{array} \right\} \tag{R[$\rho$]}$$

It is known (see, e.g., [11]) that in general (R[ρ]) is NP-hard; this is so already for the associated analysis problem “Given $x$, check whether it is feasible for (R[ρ])”, and even in the case when all the “edge matrices” $A^\ell[x]$, $\ell = 1, ..., L$, are of rank 2. At the same time, we can easily point out a tractable approximation of (R[ρ]), namely, the semidefinite program

$$\min_{x, \{X_\ell\}} \left\{ f^T x : \begin{array}{l} C[x] \succeq 0 \\ X_\ell \succeq \pm A^\ell[x], \; \ell = 1, ..., L, \\ A^0[x] - \rho \sum_{\ell=1}^{L} X_\ell \succeq 0 \end{array} \right\} \tag{AR[$\rho$]}$$

This indeed is an approximation: the $x$-component of a feasible solution to (AR[ρ]) clearly is feasible for (R[ρ]). Surprisingly enough, this fairly simplistic approximation turns out to be pretty tight, provided that the edge matrices $A^\ell[x]$, $\ell \ge 1$, are of small rank:

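Theorem 1 [6] Let $A_0, A_1, ..., A_L$ be $m \times m$ symmetric matrices. Consider the following two predicates:

$$\begin{array}{ll} (I[\rho]) : & A_0 + \sum_{\ell=1}^{L} u_\ell A_\ell \succeq 0 \quad \forall (u : \|u\|_\infty \le \rho); \\ (II[\rho]) : & \exists X_1, ..., X_L : \; X_\ell \succeq \pm A_\ell, \; \ell = 1, ..., L, \quad \rho \sum_{\ell=1}^{L} X_\ell \preceq A_0; \end{array} \tag{1}$$

here $\rho \ge 0$ is a parameter.

Then

(i) If $(II[\rho])$ is valid, so is $(I[\rho])$;

(ii) If $(II[\rho])$ is not valid, neither is $(I[\vartheta(\mu)\rho])$, where

$$\mu = \max_{1 \le \ell \le L} \mathrm{Rank}(A_\ell)$$

(note $1 \le \ell$ in the max!) and $\vartheta(\mu)$ is a universal function of $\mu$ given by

$$\frac{1}{\vartheta(k)} = \min \left\{ \int \Big| \sum_{i=1}^{k} \alpha_i u_i^2 \Big| \, (2\pi)^{-k/2} \exp\Big\{ -\frac{u^T u}{2} \Big\} \, du \; : \; \alpha \in \mathbf{R}^k, \; \|\alpha\|_1 = 1 \right\}. \tag{2}$$

Note that

$$\vartheta(1) = 1, \quad \vartheta(2) = \frac{\pi}{2} \approx 1.57..., \quad \vartheta(3) = 1.73..., \quad \vartheta(4) = 2; \quad \vartheta(\mu) \le \frac{\pi \sqrt{\mu}}{2} \;\; \forall \mu. \tag{3}$$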
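The value ϑ(2) = π/2 quoted in (3) can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses the polar-coordinate reduction E|α_1 u_1^2 + α_2 u_2^2| = 2 · mean over θ of |α_1 cos^2 θ + α_2 sin^2 θ| (valid because r^2 = u_1^2 + u_2^2 has mean 2 and is independent of the angle), and it scans α on a grid; the function names and grid sizes are hypothetical choices, not part of the original text.

```python
import math

def expected_abs_form(alpha1, alpha2, n_theta=2000):
    """E|alpha1*u1^2 + alpha2*u2^2| for a standard Gaussian u in R^2.

    In polar coordinates u = r(cos t, sin t), r^2 is independent of t with
    E[r^2] = 2, so the expectation equals
    2 * mean_t |alpha1 cos^2 t + alpha2 sin^2 t|.
    The mean over t is approximated by the midpoint rule.
    """
    total = 0.0
    for j in range(n_theta):
        t = 2.0 * math.pi * (j + 0.5) / n_theta
        total += abs(alpha1 * math.cos(t) ** 2 + alpha2 * math.sin(t) ** 2)
    return 2.0 * total / n_theta

def theta_2(n_alpha=100):
    """Grid approximation of theta(2) = 1 / min{E|...| : |alpha1| + |alpha2| = 1}."""
    best = float("inf")
    for i in range(n_alpha + 1):
        a = i / n_alpha
        # The absolute value is invariant under alpha -> -alpha, so it suffices
        # to scan alpha = (a, 1 - a) and alpha = (a, -(1 - a)) with 0 <= a <= 1.
        for alpha2 in (1.0 - a, -(1.0 - a)):
            best = min(best, expected_abs_form(a, alpha2))
    return 1.0 / best

if __name__ == "__main__":
    print(theta_2(), math.pi / 2)
```

The minimum is attained at α = (1/2, -1/2), where the expectation equals 2/π.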



Corollary 2.1 Consider the robust counterpart (R[ρ]) of an uncertain SDP affected by interval uncertainty along with the approximated robust counterpart (AR[ρ]) of the problem, and let

$$\mu = \max_{1 \le \ell \le L} \max_x \mathrm{Rank}(A^\ell[x])$$

(note $1 \le \ell$ in the max!). Then (AR[ρ]) is at most $\vartheta(\mu)$-conservative approximation of (R[ρ]), where $\vartheta$ is given by (2). In particular,

■ The suprema $\rho^*$ and $\hat\rho$ of those $\rho \ge 0$ for which (R[ρ]), respectively, (AR[ρ]) is feasible, are linked by the relation

$$\hat\rho \le \rho^* \le \vartheta(\mu) \hat\rho;$$

■ The optimal values $f^*(\rho)$, $\hat f(\rho)$ of (R[ρ]), respectively, (AR[ρ]) are linked by the relation

$$f^*(\rho) \le \hat f(\rho) \le f^*(\vartheta(\mu)\rho), \quad \rho \ge 0.$$

The essence of the matter is that the quality of the approximate robust counterpart as stated by Corollary 2.1 depends solely on the ranks of the “basic perturbation matrices” $A^\ell[x]$, $\ell \ge 1$, and is independent of any other sizes of the problem. Fortunately, there are many applications where the ranks of the basic perturbations are small, so that the quality of the approximation is not too bad. As an important example, consider the Lyapunov Stability Analysis problem.

Lyapunov Stability Analysis. Consider an uncertain time-varying linear dynamic system

$$\dot z(t) = A(t) z(t) \tag{4}$$

where all we know about the matrix $A(t)$ of the system is that it is a measurable function of $t$ taking values in a given compact set $\mathcal{U}$ which, for the sake of simplicity, is assumed to be an interval matrix:

$$A(t) \in \mathcal{U} = \mathcal{U}_\rho = \{ A \in \mathbf{R}^{n \times n} : |A_{ij} - A^*_{ij}| \le \rho D_{ij}, \; i, j = 1, ..., n \}; \tag{5}$$

here $A^*$ corresponds to the “nominal” time-invariant system, and $D$ is a given “scale matrix” with nonnegative entries.

In applications, the very first question about (4) is whether the system is stable, i.e., whether it is true that whatever is a measurable function $A(\cdot)$ taking values in $\mathcal{U}_\rho$, every trajectory $z(t)$ of (4) converges to 0 as $t \to \infty$. The standard sufficient condition for the stability of (4) - (5) is the existence of a common quadratic Lyapunov stability certificate for all matrices $A \in \mathcal{U}_\rho$, i.e., the existence of an $n \times n$ matrix $X \succ 0$ such that

$$A^T X + XA \prec 0 \quad \forall A \in \mathcal{U}_\rho.$$

Indeed, if such a certificate $X$ exists, then

$$A^T X + XA \preceq -\alpha X$$

for certain $\alpha > 0$ and all $A \in \mathcal{U}_\rho$. As an immediate computation demonstrates, the latter inequality implies that

$$\frac{d}{dt} \left( z^T(t) X z(t) \right) \le -\alpha z^T(t) X z(t)$$

for all $t$ and all trajectories of (4), provided that $A(t) \in \mathcal{U}_\rho$ for all $t$. The resulting differential inequality, in turn, implies that $z^T(t) X z(t) \le \exp\{-\alpha t\} (z^T(0) X z(0)) \to 0$ as $t \to +\infty$; since $X \succ 0$, it follows that $z(t) \to 0$ as $t \to \infty$.

Note that by homogeneity reasons a stability certificate, if it exists, always can be normalized by the requirements

$$\begin{array}{ll} (a) & X \succeq I, \\ (b) & A^T X + XA \preceq -I \quad \forall A \in \mathcal{U}_\rho. \end{array} \tag{L[$\rho$]}$$

Thus, whenever (L[ρ]) is solvable, we can be sure that system (4) - (5) is stable. Although this condition is in general only sufficient and not necessary for stability (it is necessary only when $\rho = 0$, i.e., in the case of a certain time-invariant system), it is commonly used to certify stability. It is, however, not always easy to check the condition itself, since (L[ρ]) is a semi-infinite system of LMIs. Of course, since the LMI (L[ρ].b) is linear in $A$, this semi-infinite system of LMIs is equivalent to the usual - finite - system of LMIs

$$\begin{array}{ll} (a) & X \succeq I, \\ (b) & A_\nu^T X + X A_\nu \preceq -I \quad \forall \nu = 1, ..., 2^N, \end{array} \tag{6}$$

where $N$ is the number of uncertain entries in the matrix $A$ (i.e., the number of pairs $ij$ such that $D_{ij} \ne 0$) and $A_1, ..., A_{2^N}$ are the vertices of $\mathcal{U}_\rho$. However, the size of (6) is not polynomial in $n$, except for the (unrealistic) case when $N$ is fixed once and for all or grows logarithmically slowly with $n$. In general, it is NP-hard to check whether (6), or, which is the same, (L[ρ]) is feasible [11].
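To make the exponential size of the vertex system (6) concrete, here is a minimal sketch that enumerates all 2^N vertices of the interval matrix for a 2 x 2 example and tests a candidate certificate X at each of them. The helper names and all numerical data are hypothetical illustrations; the largest eigenvalue of a symmetric 2 x 2 matrix is computed in closed form, so the sketch does not generalize beyond n = 2 as written.

```python
import itertools
import math

def max_eig_sym2(s):
    """Largest eigenvalue of a symmetric 2x2 matrix [[p, q], [q, r]]."""
    p, q, r = s[0][0], s[0][1], s[1][1]
    return 0.5 * (p + r) + math.hypot(0.5 * (p - r), q)

def lyapunov_lhs(a, x):
    """Return A^T X + X A for 2x2 matrices given as nested lists."""
    n = 2
    return [[sum(a[k][i] * x[k][j] + x[i][k] * a[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

def certifies_all_vertices(a_nom, d, rho, x):
    """Check X >= I and A_v^T X + X A_v <= -I at every vertex A_v of U_rho.

    Returns (is_certificate, number_of_vertices_visited).
    """
    # X >= I  <=>  lambda_min(X) >= 1  <=>  lambda_max(-X) <= -1.
    neg_x = [[-x[i][j] for j in range(2)] for i in range(2)]
    if max_eig_sym2(neg_x) > -1.0 + 1e-12:
        return False, 0
    uncertain = [(i, j) for i in range(2) for j in range(2) if d[i][j] != 0]
    count = 0
    # One vertex per sign pattern: 2^N patterns for N uncertain entries.
    for signs in itertools.product((-1.0, 1.0), repeat=len(uncertain)):
        a = [row[:] for row in a_nom]
        for (i, j), s in zip(uncertain, signs):
            a[i][j] += s * rho * d[i][j]
        count += 1
        # A^T X + X A <= -I  <=>  lambda_max(A^T X + X A) <= -1.
        if max_eig_sym2(lyapunov_lhs(a, x)) > -1.0 + 1e-12:
            return False, count
    return True, count

if __name__ == "__main__":
    a_nom = [[-2.0, 0.0], [0.0, -2.0]]   # hypothetical nominal matrix
    d = [[0, 1], [1, 0]]                 # two uncertain off-diagonal entries
    identity = [[1.0, 0.0], [0.0, 1.0]]
    print(certifies_all_vertices(a_nom, d, 0.5, identity))
    print(certifies_all_vertices(a_nom, d, 2.0, identity))
```

With N uncertain entries the loop visits up to 2^N vertices, which is exactly why the text turns to a tractable approximation instead.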

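Now note that with the interval uncertainty (5), the troublemaking semi-infinite LMI (L[ρ].b) is nothing but the robust counterpart

$$[-I - (A^*)^T X - X A^*] + \sum_{i,j=1}^{n} u_{ij} D_{ij} \left[ e_j (X e_i)^T + (X e_i) e_j^T \right] \succeq 0 \quad \forall (u = \{u_{ij}\} : -\rho \le u_{ij} \le \rho) \tag{7}$$

of the uncertain LMI

$$\left\{ \{ A^T X + XA \preceq -I \} : A \in \mathcal{U}_\rho \right\};$$

here $e_i$ are the standard basic orths in $\mathbf{R}^n$. Consequently, we can approximate (L[ρ]) with a tractable system of LMIs

$$\begin{array}{ll} (a) & X \succeq I \\ (b1) & X^{ij} \succeq \pm \underbrace{D_{ij} \left[ e_j (X e_i)^T + (X e_i) e_j^T \right]}_{A^{ij}[X]}, \quad \forall (i, j : D_{ij} > 0) \\ (b2) & \underbrace{[-I - (A^*)^T X - X A^*]}_{A^0[X]} - \rho \sum_{i,j : D_{ij} > 0} X^{ij} \succeq 0 \end{array} \tag{AL[$\rho$]}$$

in matrix variables $X$, $\{X^{ij}\}$.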

Invoking Corollary 2.1, we see that the relations between (AL[ρ]) and (L[ρ]) are as follows:

1 Whenever $X$ can be extended to a feasible solution of the system (AL[ρ]), $X$ is feasible for (L[ρ]) and thus certifies the stability of the uncertain dynamic system (4), the uncertainty set for the system being $\mathcal{U}_\rho$;

2 If $X$ cannot be extended to a feasible solution of the system (AL[ρ]), then $X$ does not solve the system (L[$\frac{\pi}{2}\rho$]) (note that the ranks of the basic perturbation matrices $A^{ij}[X]$ are at most 2, and $\vartheta(2) = \frac{\pi}{2}$).

It follows that the supremum $\hat\rho$ of those $\rho \ge 0$ for which (AL[ρ]) is solvable is a lower bound for the Lyapunov stability radius of uncertain system (4), i.e., for the supremum $\rho^*$ of those $\rho \ge 0$ for which all matrices from $\mathcal{U}_\rho$ share a common Lyapunov stability certificate, and that this lower bound is tight up to factor $\frac{\pi}{2}$:

$$\hat\rho \le \rho^* \le \frac{\pi}{2} \hat\rho,$$

provided, of course, that $A^*$ is stable (or, which is the same, that $\rho^* > 0$).

Note that the bound $\hat\rho$ on the Lyapunov stability radius is efficiently computable; this is the optimal value in the Generalized Eigenvalue Problem of maximizing $\rho$ in variables $\rho$, $X$, $\{X^{ij}\}$ under the constraints (AL[ρ]).

We have considered a specific application of Theorem 1 in Control. 
There are many other applications of this theorem to systems of LMIs 
arising in Control and affected by an interval data uncertainty. Usually 
the structure of such a system ensures that when perturbing a single 
data entry, the right hand side of every LMI is perturbed by a matrix of 
a small rank, which is the favourable case for our approximation scheme. 

Simplifying the approximation. A severe computational shortcoming of the approximation (AR[ρ]) is that its sizes, although polynomial in the sizes of the approximated system (R[ρ]) and the uncertainty set, are pretty large, since the approximation has an additional $m \times m$ matrix variable $X_\ell$ and two $m \times m$ LMIs $X_\ell \succeq \pm A^\ell[x]$ per every one of the basic perturbations. It turns out that under favourable circumstances the sizes of the approximation can be reduced dramatically. This size reduction is based upon the following two facts:
Lemma 2.1 [6] (i) Let $a \ne 0$, $b$ be two vectors. A matrix $X$ satisfies the relation

$$X \succeq \pm [ab^T + ba^T]$$

if and only if there exists $\lambda \ge 0$ such that

$$X \succeq \lambda a a^T + \frac{1}{\lambda} b b^T$$

(when $\lambda = 0$, the right hand side, by definition, is the zero matrix when $b = 0$ and is undefined otherwise).

(ii) Let $S$ be a symmetric $m \times m$ matrix of rank $k > 0$, so that $S = P^T R P$ with a nonsingular $k \times k$ matrix $R$ and $k \times m$ matrix $P$ of rank $k$. A matrix $X$ satisfies the relation

$$X \succeq \pm S$$

if and only if there exists a $k \times k$ matrix $Y$ such that $Y \succeq \pm R$ and

$$X \succeq P^T Y P.$$
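The "if" part of Lemma 2.1(i) rests on a rank-one identity: for λ > 0, the matrix λ a a^T + (1/λ) b b^T ∓ (a b^T + b a^T) equals w w^T with w = sqrt(λ) a ∓ b / sqrt(λ), hence is positive semidefinite. The sketch below checks this identity entrywise on sample data; the function names and the numbers are hypothetical illustrations, not taken from [6].

```python
import math

def outer(u, v):
    """Outer product u v^T as a nested list."""
    return [[ui * vj for vj in v] for ui in u]

def gap_matrix(a, b, lam, sign):
    """lam*aa^T + (1/lam)*bb^T - sign*(ab^T + ba^T), with sign = +1 or -1."""
    aa, bb, ab, ba = outer(a, a), outer(b, b), outer(a, b), outer(b, a)
    n = len(a)
    return [[lam * aa[i][j] + bb[i][j] / lam - sign * (ab[i][j] + ba[i][j])
             for j in range(n)] for i in range(n)]

def rank_one_factor(a, b, lam, sign):
    """Vector w such that gap_matrix(a, b, lam, sign) equals w w^T."""
    r = math.sqrt(lam)
    return [r * ai - sign * bi / r for ai, bi in zip(a, b)]

def max_abs_difference(a, b, lam, sign):
    """Largest entrywise deviation between the gap matrix and w w^T."""
    g = gap_matrix(a, b, lam, sign)
    w = rank_one_factor(a, b, lam, sign)
    ww = outer(w, w)
    n = len(a)
    return max(abs(g[i][j] - ww[i][j]) for i in range(n) for j in range(n))

if __name__ == "__main__":
    a, b = [1.0, 2.0, -1.0], [0.5, -1.0, 3.0]   # hypothetical vectors
    for sign in (1.0, -1.0):
        print(sign, max_abs_difference(a, b, 0.7, sign))
```

Since both sign choices give a matrix of the form w w^T, any X dominating λ a a^T + (1/λ) b b^T also dominates ±(a b^T + b a^T).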

The simplification of (AR[ρ]), based on Lemma 2.1, is as follows. Let us partition the basic perturbation matrices $A^\ell[x]$ into two groups: those with $A^\ell[x]$ actually depending on $x$, and those with $A^\ell[x]$ independent of $x$. Assume that

(A) The basic perturbation matrices depending on $x$, let them be $A^1[x], ..., A^M[x]$, are of the form

$$A^\ell[x] = a_\ell b_\ell^T[x] + b_\ell[x] a_\ell^T, \quad \ell = 1, ..., M, \tag{8}$$

where $a_\ell$, $b_\ell[x]$ are vectors and $b_\ell[x]$ are affine in $x$.

Note that the assumption holds true, e.g., in the case of the Lyapunov Stability Analysis problem under interval uncertainty, see (AL[ρ]).

The basic perturbation matrices $A^\ell$ with $\ell > M$ are independent of $x$, and we can represent these matrices as

$$A^\ell = P_\ell^T B_\ell P_\ell, \tag{9}$$

where $B_\ell$ are nonsingular symmetric $k_\ell \times k_\ell$ matrices and $k_\ell = \mathrm{Rank}(A^\ell)$.

Observe that when speaking about the approximate robust counterpart (AR[ρ]), we are not interested at all in the additional matrix variables $X_\ell$; all that matters is the projection of the feasible set of (AR[ρ]) on the plane of the original design variables $x$. In other words, as far as the approximating properties are concerned, we lose nothing when replacing the constraints in (AR[ρ]) with any other system $\mathcal{S}$ of constraints in variables $x$ and, perhaps, additional variables, provided that the projection of the feasible set of the new system on the plane of $x$-variables is the same as the one for (AR[ρ]). Invoking Lemma 2.1, we see that the latter property is possessed by the system of constraints

$$\begin{array}{ll} (a) & C[x] \succeq 0 \\ (b) & \lambda_\ell \ge 0, \; \ell = 1, ..., M \\ (c) & Y_\ell \succeq \pm B_\ell, \; \ell = M + 1, ..., L \\ (d) & A^0[x] - \rho \left[ \sum_{\ell=1}^{M} \left( \lambda_\ell a_\ell a_\ell^T + \frac{1}{\lambda_\ell} b_\ell[x] b_\ell^T[x] \right) + \sum_{\ell=M+1}^{L} P_\ell^T Y_\ell P_\ell \right] \succeq 0, \end{array} \tag{10}$$

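in variables $x$, $\{\lambda_\ell\}$, $\{Y_\ell\}$. By the Schur Complement Lemma, (10) is equivalent to the system of LMIs

$$\begin{array}{ll} (a) & C[x] \succeq 0 \\ (b) & Y_\ell \succeq \pm B_\ell, \; \ell = M + 1, ..., L, \\ (c) & \begin{pmatrix} A^0[x] - \rho \sum_{\ell=1}^{M} \lambda_\ell a_\ell a_\ell^T - \rho \sum_{\ell=M+1}^{L} P_\ell^T Y_\ell P_\ell & \rho b_1[x] & \rho b_2[x] & \cdots & \rho b_M[x] \\ \rho b_1^T[x] & \rho \lambda_1 & & & \\ \rho b_2^T[x] & & \rho \lambda_2 & & \\ \vdots & & & \ddots & \\ \rho b_M^T[x] & & & & \rho \lambda_M \end{pmatrix} \succeq 0 \end{array} \tag{11}$$

in variables $x$, $\{\lambda_\ell \in \mathbf{R}\}_{\ell=1}^{M}$, $\{Y_\ell \in \mathbf{S}^{k_\ell}\}_{\ell=M+1}^{L}$. Consequently, (AR[ρ]) is equivalent to the semidefinite program of minimizing the objective $f^T x$ under the constraints (11). Note that the resulting problem (A[ρ]) is typically much better suited for numerical processing than (AR[ρ]). Indeed, the first $M$ of the $m \times m$ matrix variables $X_\ell$ arising in the original problem are now replaced with $M$ scalar variables $\lambda_\ell$, while the remaining $L - M$ of the $X_\ell$'s are replaced with $k_\ell \times k_\ell$ matrix variables $Y_\ell$; normally, the ranks $k_\ell$ of the basic perturbation matrices are much smaller than the sizes $m$ of these matrices, so that this transformation reduces quite significantly the design dimension of the problem. As for the LMIs, the $2L$ “large” (of the size $m \times m$) LMIs $X_\ell \succeq \pm A^\ell[x]$ of the original problem are now replaced with $2(L - M)$ “small” (of the sizes $k_\ell \times k_\ell$) LMIs (11.b) and a single “very large” - of the size $(m + M) \times (m + M)$ - LMI (11.c). Note that the latter LMI, although large, is of a very simple “arrow” structure and is extremely sparse.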



Links with quadratic maximization over the unit cube. It turns out that Theorem 1 has direct links with the problem of maximizing a positive definite quadratic form over the unit cube. The link is given by the following simple observation:

Proposition 2.1 Let $Q$ be a positive definite $m \times m$ matrix. Then the reciprocal $\rho(Q)$ of the quantity

$$\omega(Q) = \max \{ x^T Q x : \|x\|_\infty \le 1 \}$$

equals the maximum of those $\rho \ge 0$ for which all matrices from the matrix box

$$C_\rho = Q^{-1} + \{ A = A^T : |A_{ij}| \le \rho \}$$

are positive semidefinite.

Proof. $\omega(Q)$ is the minimum of those $\omega$ for which the ellipsoid $\{ x : x^T Q x \le \omega \}$ contains the unit cube, or, which is the same, the minimum of those $\omega$ for which the polar of the ellipsoid (which is the ellipsoid $\{ \xi : \xi^T Q^{-1} \xi \le \omega^{-1} \}$) is contained in the polar of the cube (which is the set $\{ \xi : \|\xi\|_1 \le 1 \}$). In other words,

$$\rho(Q) \equiv \omega^{-1}(Q) = \max \left\{ \rho : \xi^T Q^{-1} \xi \ge \rho \|\xi\|_1^2 \;\; \forall \xi \right\}.$$

Observing that by evident reasons

$$\|\xi\|_1^2 = \max \{ \xi^T A \xi : A = A^T, \; |A_{ij}| \le 1, \; i, j = 1, ..., m \},$$

we conclude that

$$\rho(Q) = \max \{ \rho : Q^{-1} \succeq \rho A \;\; \forall (A = A^T : |A_{ij}| \le 1, \; i, j = 1, ..., m) \},$$

as claimed. ■

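Since the “edge matrices” of the matrix box $C_\rho$ are of ranks 1 or 2 (these are the matrices

$$E^{ij} = \begin{cases} e_i e_i^T, & i = j \\ e_i e_j^T + e_j e_i^T, & i < j \end{cases},$$

$1 \le i \le j \le m$, where $e_i$ are the standard basic orths), Theorem 1 says that the efficiently computable quantity

$$\hat\rho = \sup \left\{ \rho : \exists \{X^{ij}\} : \begin{array}{l} X^{ij} \succeq \pm E^{ij}, \; 1 \le i \le j \le m \\ Q^{-1} - \rho \sum_{1 \le i \le j \le m} X^{ij} \succeq 0 \end{array} \right\}$$

is a lower bound, tight within the factor $\vartheta(2) = \frac{\pi}{2}$, for the quantity $\rho(Q)$, and consequently the quantity $\hat\omega(Q) = \frac{1}{\hat\rho}$ is an upper bound, tight within the same factor $\frac{\pi}{2}$, for the maximum $\omega(Q)$ of the quadratic form $x^T Q x$ over the unit cube:

$$\omega(Q) \le \hat\omega(Q) \le \frac{\pi}{2} \omega(Q). \tag{12}$$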
On closer inspection (see [6]), it turns out that $\hat\omega(Q)$ is nothing but the standard semidefinite bound

$$\hat\omega(Q) = \max \{ \mathrm{Tr}(QX) : X \succeq 0, \; X_{ii} \le 1, \; i = 1, ..., m \} = \min \left\{ \sum_i \lambda_i : \mathrm{Diag}\{\lambda\} \succeq Q \right\} \tag{13}$$

on $\omega(Q)$. The fact that bound (13) satisfies (12) was originally established by Yu. Nesterov [13] via a completely different construction which heavily exploits the famous MAXCUT-related “random hyperplane” technique of Goemans and Williamson [10]. Surprisingly, the re-derivation of (12) we have presented, although it uses randomization arguments, seems to have nothing in common with the random hyperplane technique.
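For a tiny instance, the maximum of the quadratic form over the cube and an upper bound of the dual type min{Σ λ_i : Diag{λ} ⪰ Q} can be compared directly. The sketch below is an illustration only: it finds ω(Q) by scanning the vertices of the cube (a convex form attains its maximum over the cube at a vertex) and checks that a hand-picked, hypothetical λ is dual feasible; it does not solve the semidefinite program, and the principal-minor test is practical only for very small matrices.

```python
import itertools

def quad_form(q, x):
    """Value of x^T Q x."""
    n = len(x)
    return sum(q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def omega_bruteforce(q):
    """max{x^T Q x : ||x||_inf <= 1} for positive semidefinite Q,
    found by scanning the 2^m vertices of the cube."""
    m = len(q)
    return max(quad_form(q, x) for x in itertools.product((-1.0, 1.0), repeat=m))

def det(s):
    """Determinant by Laplace expansion (fine for the tiny matrices used here)."""
    if len(s) == 1:
        return s[0][0]
    return sum((-1) ** j * s[0][j] * det([row[:j] + row[j + 1:] for row in s[1:]])
               for j in range(len(s)))

def is_psd(s, tol=1e-10):
    """A symmetric matrix is PSD iff all its principal minors are nonnegative."""
    m = len(s)
    for size in range(1, m + 1):
        for idx in itertools.combinations(range(m), size):
            sub = [[s[i][j] for j in idx] for i in idx]
            if det(sub) < -tol:
                return False
    return True

def dual_bound(q, lam):
    """Return sum(lam) if Diag(lam) - Q is PSD (then it upper-bounds omega(Q)),
    otherwise None."""
    m = len(q)
    gap = [[(lam[i] if i == j else 0.0) - q[i][j] for j in range(m)]
           for i in range(m)]
    return sum(lam) if is_psd(gap) else None

if __name__ == "__main__":
    q = [[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]]  # hypothetical Q
    print(omega_bruteforce(q), dual_bound(q, [3.0, 4.0, 3.0]))
```

Any dual feasible λ gives x^T Q x ≤ Σ λ_i x_i^2 ≤ Σ λ_i on the cube; for this particular Q the chosen λ happens to match the brute-force maximum, so the bound is tight on this instance.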

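Theorem 1: Sketch of the proof. We believe that not only the statement, but also the proof of Theorem 1 is rather instructive; this is why we sketch the proof here. We intend to focus on the most nontrivial part of the Theorem, which is the claim that when $(II[\rho])$ is not valid, neither is $(I[\vartheta(\mu)\rho])$ (as for the remaining statements of Theorem 1, note that (i) is evident, while the claim that the function (2) satisfies (3) is more or less straightforward). Thus, assume that $(II[\rho])$ is not valid; we should prove that then $(I[\vartheta(\mu)\rho])$ also is not valid, where $\vartheta(\cdot)$ is given by (2).

The fact that $(II[\rho])$ is not valid means exactly that the optimal value in the semidefinite program

$$\min_{t, \{X_\ell\}} \left\{ t : X_\ell \succeq \pm A_\ell, \; \ell = 1, ..., L; \;\; \rho \sum_{\ell=1}^{L} X_\ell \preceq A_0 + tI \right\}$$

is positive. Applying semidefinite duality (which is a completely mechanical process) we, after simple manipulations, conclude that in this case there exists a matrix $W \succeq 0$, $\mathrm{Tr}(W) = 1$, such that

$$\rho \sum_{\ell=1}^{L} \| \lambda(W^{1/2} A_\ell W^{1/2}) \|_1 > \mathrm{Tr}(W^{1/2} A_0 W^{1/2}), \tag{14}$$

where $\lambda(B)$ is the vector of eigenvalues of a symmetric matrix $B$. Now observe that if $B$ is a symmetric $m \times m$ matrix and $\xi$ is an $m$-dimensional Gaussian random vector with zero mean and unit covariance matrix, then

$$\mathbf{E} \{ |\xi^T B \xi| \} \ge \| \lambda(B) \|_1 \, \vartheta^{-1}(\mathrm{Rank}(B)). \tag{15}$$

Indeed, in the case when $B$ is diagonal, this relation is a direct consequence of the definition of $\vartheta(\cdot)$; the general case can be reduced immediately to the case of diagonal $B$ due to the rotational invariance of the distribution of $\xi$.

Since the matrices $W^{1/2} A_\ell W^{1/2}$ are of ranks not exceeding $\mu = \max_{1 \le \ell \le L} \mathrm{Rank}(A_\ell)$, (14) combines with (15) to imply that

$$\mathbf{E} \left\{ \rho \vartheta(\mu) \sum_{\ell=1}^{L} |\xi^T W^{1/2} A_\ell W^{1/2} \xi| \right\} > \mathrm{Tr}(W^{1/2} A_0 W^{1/2}) = \mathbf{E} \left\{ \xi^T W^{1/2} A_0 W^{1/2} \xi \right\}.$$

It follows that there exists $\zeta = W^{1/2} \xi$ such that $\rho \vartheta(\mu) \sum_{\ell=1}^{L} |\zeta^T A_\ell \zeta| > \zeta^T A_0 \zeta$; setting $\epsilon_\ell = \mathrm{sign}(\zeta^T A_\ell \zeta)$, we can rewrite the latter relation as

$$\zeta^T \Big( \sum_{\ell=1}^{L} \underbrace{(\rho \vartheta(\mu) \epsilon_\ell)}_{u_\ell} A_\ell \Big) \zeta > \zeta^T A_0 \zeta,$$

and we see that the matrix $A_0 - \sum_{\ell=1}^{L} u_\ell A_\ell$ is not positive semidefinite, while by construction $|u_\ell| \le \vartheta(\mu)\rho$. Thus, $(I[\vartheta(\mu)\rho])$ indeed is not valid.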



3. Approximate robust counterparts of 
uncertain convex quadratic problems 

Recall that a generic convex quadratically constrained problem is

$$\min_x \left\{ c^T x : x^T A_i^T A_i x \le 2 b_i^T x + c_i, \; i = 1, ..., m \right\} \tag{QP}$$

here $x \in \mathbf{R}^n$, and $A_i$ are $m_i \times n$ matrices. The data of an instance is $(c, \{A_i, b_i, c_i\}_{i=1}^{m})$. When speaking about uncertain problems of this type, we may focus on the robust counterpart of the system of constraints (since we have agreed to treat the objective as certain). In fact we can restrict ourselves to building an (approximate) robust counterpart of a single convex quadratic constraint, since the robust counterpart is a “constraint-wise” construction. Thus, we intend to focus on building an approximate robust counterpart of a single constraint

$$x^T A^T A x \le 2 b^T x + c \tag{16}$$

with the data $(A, b, c)$. We assume that the uncertainty set is “parameterized” by a vector of perturbations:

$$\mathcal{U} = \mathcal{U}_\rho = \left\{ (A, b, c) = (A^{\mathrm{n}}, b^{\mathrm{n}}, c^{\mathrm{n}}) + \sum_{\ell=1}^{L} \zeta_\ell (A^\ell, b^\ell, c^\ell) : \zeta \in \rho V \right\}, \tag{17}$$

here $V$ is a convex compact symmetric w.r.t. the origin set in $\mathbf{R}^L$ (“the set of standard perturbations”) and $\rho \ge 0$ is the “perturbation level”.

In what follows, we shall focus on the case when $V$ is given as an intersection of ellipsoids centered at the origin:

$$V = \left\{ \zeta \in \mathbf{R}^L \mid \zeta^T Q_i \zeta \le 1, \; i = 1, ..., k \right\}, \tag{18}$$

where $Q_i \succeq 0$ and $\sum_{i=1}^{k} Q_i \succ 0$. We will be interested also in two particular cases of general ellipsoidal uncertainty (18), namely, in the cases of

• simple ellipsoidal uncertainty: $k = 1$;

• box uncertainty: $k = L$ and $\zeta^T Q_i \zeta = \zeta_i^2$, $i = 1, ..., L$.

Note that the robust counterpart of an uncertain quadratic constraint affected by ellipsoidal uncertainty (18) is, in general, NP-hard. Indeed, already in the case of box uncertainty, verifying robust feasibility of a given candidate solution is not easier than maximizing a convex quadratic form over the unit cube, which is known to be NP-hard. Thus, all we can hope for in the case of uncertainty (18) is a “computationally tractable” approximate robust counterpart with a moderate level of conservativeness, and we are about to build such a counterpart.



3.1 Building the robust counterpart of (16) - (18)

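We build an approximate robust counterpart via the standard semidefinite relaxation scheme. For $x \in \mathbf{R}^n$, let

$$a[x] = A^{\mathrm{n}} x, \quad A[x] = \rho \left[ A^1 x, A^2 x, ..., A^L x \right], \quad b[x] = \rho \begin{pmatrix} x^T b^1 \\ \vdots \\ x^T b^L \end{pmatrix}, \quad d = \frac{\rho}{2} \begin{pmatrix} c^1 \\ \vdots \\ c^L \end{pmatrix}, \quad e[x] = 2 x^T b^{\mathrm{n}} + c^{\mathrm{n}}, \tag{19}$$

so that for all $\zeta$ one has

$$x^T \Big[ A^{\mathrm{n}} + \rho \sum_\ell \zeta_\ell A^\ell \Big]^T \Big[ A^{\mathrm{n}} + \rho \sum_\ell \zeta_\ell A^\ell \Big] x = (a[x] + A[x]\zeta)^T (a[x] + A[x]\zeta),$$

$$2 x^T \Big[ b^{\mathrm{n}} + \rho \sum_\ell \zeta_\ell b^\ell \Big] + \Big[ c^{\mathrm{n}} + \rho \sum_\ell \zeta_\ell c^\ell \Big] = 2 (b[x] + d)^T \zeta + e[x].$$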
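The two identities behind (19) can be sanity-checked numerically. The sketch below evaluates both sides of constraint (16) once with the perturbed data (nominal data plus ρ Σ ζ_ℓ (A^ℓ, b^ℓ, c^ℓ), with ζ ranging over the standard perturbation set) and once through a[x], A[x], b[x], d, e[x]; all function names and numbers are hypothetical, and the scaling convention (ρ absorbed into A[x], b[x] and d) is an assumption read off from (19).

```python
def matvec(m, v):
    """Matrix-vector product for nested lists."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def both_sides(x, zeta, rho, a_nom, b_nom, c_nom, a_pert, b_pert, c_pert):
    """Return (lhs_direct, lhs_19, rhs_direct, rhs_19) for constraint (16)."""
    num_pert = len(zeta)
    rows, cols = len(a_nom), len(x)
    # Direct evaluation with the perturbed data (A, b, c).
    a_mat = [[a_nom[i][j] + rho * sum(zeta[l] * a_pert[l][i][j] for l in range(num_pert))
              for j in range(cols)] for i in range(rows)]
    b_vec = [b_nom[j] + rho * sum(zeta[l] * b_pert[l][j] for l in range(num_pert))
             for j in range(cols)]
    c_val = c_nom + rho * sum(zeta[l] * c_pert[l] for l in range(num_pert))
    ax = matvec(a_mat, x)
    lhs_direct = dot(ax, ax)
    rhs_direct = 2.0 * dot(b_vec, x) + c_val
    # Evaluation through the quantities defined in (19).
    a_x = matvec(a_nom, x)                                                   # a[x]
    columns = [[rho * v for v in matvec(a_pert[l], x)] for l in range(num_pert)]  # columns of A[x]
    b_x = [rho * dot(x, b_pert[l]) for l in range(num_pert)]                 # b[x]
    d = [0.5 * rho * c_pert[l] for l in range(num_pert)]                     # d
    e_x = 2.0 * dot(x, b_nom) + c_nom                                        # e[x]
    u = [a_x[i] + sum(columns[l][i] * zeta[l] for l in range(num_pert))
         for i in range(rows)]
    lhs_19 = dot(u, u)
    rhs_19 = 2.0 * sum((b_x[l] + d[l]) * zeta[l] for l in range(num_pert)) + e_x
    return lhs_direct, lhs_19, rhs_direct, rhs_19

if __name__ == "__main__":
    # Hypothetical data: n = 2 variables, L = 2 perturbations.
    vals = both_sides(
        x=[1.0, -2.0], zeta=[0.3, -0.4], rho=0.5,
        a_nom=[[1.0, 2.0], [0.0, 1.0]], b_nom=[1.0, 0.5], c_nom=3.0,
        a_pert=[[[0.0, 1.0], [1.0, 0.0]], [[1.0, 0.0], [0.0, -1.0]]],
        b_pert=[[1.0, 0.0], [0.0, 2.0]], c_pert=[2.0, -1.0])
    print(vals)
```

Agreement of the two evaluations for arbitrary x and ζ is what allows the robust feasibility question to be restated purely in terms of ζ in the sequel.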



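From these relations it follows that

$$\begin{array}{ll} (a) & x \text{ is robust feasible for (16) - (18)} \\ & \Updownarrow \\ & (a[x] + A[x]\zeta)^T (a[x] + A[x]\zeta) \le 2 (b[x] + d)^T \zeta + e[x] \quad \forall (\zeta : \zeta^T Q_i \zeta \le 1, \; i = 1, ..., k) \\ & \Updownarrow \\ & \zeta^T A^T[x] A[x] \zeta + 2 \zeta^T \left[ A^T[x] a[x] - b[x] - d \right] \le e[x] - a^T[x] a[x] \quad \forall (\zeta : \zeta^T Q_i \zeta \le 1, \; i = 1, ..., k) \\ & \Updownarrow \\ & \zeta^T A^T[x] A[x] \zeta + 2 t \zeta^T \left[ A^T[x] a[x] - b[x] - d \right] \le e[x] - a^T[x] a[x] \quad \forall (\zeta, t : \zeta^T Q_i \zeta \le 1, \; i = 1, ..., k, \; t^2 = 1) \\ & \Updownarrow \\ (b) & \zeta^T A^T[x] A[x] \zeta + 2 t \zeta^T \left[ A^T[x] a[x] - b[x] - d \right] \le e[x] - a^T[x] a[x] \quad \forall (\zeta, t : \zeta^T Q_i \zeta \le 1, \; i = 1, ..., k, \; t^2 \le 1). \end{array} \tag{20}$$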

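Thus, $x$ is robust feasible if and only if the quadratic inequality (20.b) in variables $\zeta, t$ is a consequence of the system of quadratic inequalities $\zeta^T Q_i \zeta \le 1$, $i = 1, ..., k$, $t^2 \le 1$. An evident sufficient condition for (20.b) to hold true is the possibility to majorate the left hand side of (20.b) for all $\zeta, t$ by a sum $\sum_i \lambda_i \zeta^T Q_i \zeta + \mu t^2$ with nonnegative weights $\lambda_i, \mu$ satisfying the relation $\sum_i \lambda_i + \mu \le e[x] - a^T[x] a[x]$. Thus, we come to the implication

$$\begin{array}{ll} (a) & \exists (\mu \ge 0, \{\lambda_i \ge 0\}) : \begin{array}{l} \sum_i \lambda_i + \mu \le e[x] - a^T[x] a[x], \\ \sum_i \lambda_i \zeta^T Q_i \zeta + \mu t^2 \ge \zeta^T A^T[x] A[x] \zeta + 2 t \zeta^T \left[ A^T[x] a[x] - b[x] - d \right] \;\; \forall (\zeta, t) \end{array} \\ & \Downarrow \\ & \zeta^T A^T[x] A[x] \zeta + 2 t \zeta^T \left[ A^T[x] a[x] - b[x] - d \right] \le e[x] - a^T[x] a[x] \quad \forall (\zeta, t : \zeta^T Q_i \zeta \le 1, \; i = 1, ..., k, \; t^2 \le 1) \\ & \Updownarrow \\ (b) & x \text{ is robust feasible} \end{array} \tag{21}$$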
A routine processing of condition (21.a), which we skip here, demonstrates that the condition is equivalent to the solvability of the system of LMIs

$$\begin{pmatrix} e[x] - \sum_{i=1}^{k} \lambda_i & [-b[x] - d]^T & a^T[x] \\ [-b[x] - d] & \sum_{i=1}^{k} \lambda_i Q_i & -A^T[x] \\ a[x] & -A[x] & I \end{pmatrix} \succeq 0, \qquad \lambda_i \ge 0, \; i = 1, ..., k, \tag{22}$$

in variables $x$, $\lambda$. We arrive at the following

Proposition 3.1 The system of LMIs (22) is an approximate robust counterpart of the uncertain convex quadratic constraint (16) - (18).



The level of conservativeness $\Omega$ of (22) can be bounded as follows:

Theorem 2 [7]

(i) In the case of a general-type ellipsoidal uncertainty (18), one has

$$\Omega \le \bar\Omega \equiv \sqrt{3.6 + 2 \ln \Big( \sum_{i=1}^{k} \mathrm{Rank}(Q_i) \Big)}. \tag{23}$$

Note that the right hand side in (23) is $< 6$, provided that $\sum_{i=1}^{k} \mathrm{Rank}(Q_i) \le 10{,}853{,}519$.

(ii) In the case of box uncertainty: $\zeta^T Q_i \zeta = \zeta_i^2$, $1 \le i \le k = L = \dim \zeta$, one has $\Omega \le \frac{\pi}{2}$.

(iii) In the case of simple ($k = 1$) ellipsoidal uncertainty (18), $\Omega = 1$, i.e., (22) is equivalent to the robust counterpart of (16) - (18).
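The numerical remark attached to (23) is easy to verify. The short sketch below evaluates the right hand side of (23) as a function of the total rank; the function name is a hypothetical choice, and the formula is the square-root expression as it reads in (23).

```python
import math

def omega_bar(rank_sum):
    """Right hand side of (23): sqrt(3.6 + 2 ln(sum_i Rank(Q_i)))."""
    return math.sqrt(3.6 + 2.0 * math.log(rank_sum))

if __name__ == "__main__":
    for total_rank in (1, 100, 10_853_519, 10_853_520):
        print(total_rank, omega_bar(total_rank))
```

The bound grows only like the square root of a logarithm, so it stays below 6 up to the quoted threshold, which sits just below exp(16.2).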



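An instrumental role in the proof is played by the following fact, which seems to be interesting in its own right:

Theorem 3 [7] Let $R, R_0, R_1, ..., R_k$ be symmetric $n \times n$ matrices such that $R_1, ..., R_k \succeq 0$ and there exist nonnegative weights $\lambda_i$ such that $\sum_{i=0}^{k} \lambda_i R_i \succ 0$. Consider the optimization program

$$\mathrm{OPT} = \max_y \left\{ y^T R y : y^T R_i y \le 1, \; i = 0, ..., k \right\} \tag{24}$$

along with the semidefinite program

$$\mathrm{SDP} = \min_\mu \left\{ \sum_{i=0}^{k} \mu_i : \sum_{i=0}^{k} \mu_i R_i \succeq R, \; \mu \ge 0 \right\}. \tag{25}$$

Then (25) is solvable, its optimal value majorates the one of (24), and there exists a vector $y_*$ such that

$$y_*^T R y_* = \mathrm{SDP}; \quad y_*^T R_0 y_* \le 1; \quad y_*^T R_i y_* \le \Omega^2, \; i = 1, ..., k,$$

$$\Omega = \begin{cases} \sqrt{3.6 + 2 \ln \Big( \sum_{i=1}^{k} \mathrm{Rank}(R_i) \Big)}, & R_0 = q q^T \text{ is dyadic} \\ \sqrt{8 \ln 2 + 4 \ln n + 2 \ln \Big( \sum_{i=1}^{k} \mathrm{Rank}(R_i) \Big)}, & \text{otherwise} \end{cases} \tag{26}$$

In particular,

$$\mathrm{OPT} \le \mathrm{SDP} \le \Omega^2 \cdot \mathrm{OPT}.$$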



4. Approximate robust counterparts of 
uncertain conic quadratic problems 

The constructions and results of the previous section can be extended to the essentially more general case of conic quadratic problems. Recall that a generic conic quadratic problem (another name: SOCP - Second Order Cone Problem) is

$$\min_x \left\{ f^T x : \|A_i x + b_i\|_2 \le \alpha_i^T x + \beta_i, \; i = 1, ..., m \right\} \tag{CQP}$$

here $x \in \mathbf{R}^n$, and $A_i$ are $m_i \times n$ matrices; the data of (CQP) is the collection $(f, \{A_i, b_i, \alpha_i, \beta_i\}_{i=1}^{m})$. As always, we assume the objective to be “certain” and thus may restrict ourselves to building an approximate robust counterpart of a single conic quadratic constraint

$$\|Ax + b\|_2 \le \alpha^T x + \beta \tag{27}$$

with data $(A, b, \alpha, \beta)$.

We assume that the uncertainty set is parameterized by a vector of perturbations and that the uncertainty is “side-wise”: the perturbations affecting the left- and the right hand side data of (27) run independently of each other through the respective uncertainty sets:

$$\mathcal{U} = \mathcal{U}^{\mathrm{left}}_\rho \times \mathcal{U}^{\mathrm{right}}_\rho, \quad \begin{array}{l} \mathcal{U}^{\mathrm{left}}_\rho = \left\{ (A, b) = (A^{\mathrm{n}}, b^{\mathrm{n}}) + \sum_{\ell=1}^{L} \zeta_\ell (A^\ell, b^\ell) : \zeta \in \rho V^{\mathrm{left}} \right\}, \\ \mathcal{U}^{\mathrm{right}}_\rho = \left\{ (\alpha, \beta) = (\alpha^{\mathrm{n}}, \beta^{\mathrm{n}}) + \sum_{r=1}^{R} \eta_r (\alpha^r, \beta^r) : \eta \in \rho V^{\mathrm{right}} \right\}. \end{array} \tag{28}$$

In what follows, we focus on the case when $V^{\mathrm{left}}$ is given as an intersection of ellipsoids centered at the origin:

$$V^{\mathrm{left}} = \left\{ \zeta \in \mathbf{R}^L \mid \zeta^T Q_i \zeta \le 1, \; i = 1, ..., k \right\}, \tag{29}$$

where $Q_i \succeq 0$ and $\sum_{i=1}^{k} Q_i \succ 0$. We will be interested also in two particular cases of general ellipsoidal uncertainty (29), namely, in the cases of

• simple ellipsoidal uncertainty: $k = 1$;

• box uncertainty: $k = L$ and $\zeta^T Q_i \zeta = \zeta_i^2$, $i = 1, ..., L$.

As for the “right hand side” perturbation set $V^{\mathrm{right}}$, we allow for a much more general geometry, namely, we assume only that $V^{\mathrm{right}}$ is bounded, contains 0 and is semidefinite-representable:

$$\eta \in V^{\mathrm{right}} \Leftrightarrow \exists u : P(\eta) + Q(u) - S \succeq 0, \tag{30}$$

where $P(\eta)$, $Q(u)$ are symmetric matrices linearly depending on $\eta$, $u$, respectively. We assume also that the LMI in (30) is strictly feasible, i.e., that

$$P(\bar\eta) + Q(\bar u) - S \succ 0$$

for appropriately chosen $\bar\eta$, $\bar u$.



4.1 Building approximate robust counterpart of 
(27) - (30) 

For $x \in \mathbf{R}^n$, let

$$a[x] = A^{\mathrm{n}} x + b^{\mathrm{n}}, \quad A[x] = \rho \left[ A^1 x + b^1, A^2 x + b^2, ..., A^L x + b^L \right], \tag{31}$$

so that for all $\zeta$ one has

$$\Big[ A^{\mathrm{n}} + \rho \sum_\ell \zeta_\ell A^\ell \Big] x + \Big[ b^{\mathrm{n}} + \rho \sum_\ell \zeta_\ell b^\ell \Big] = a[x] + A[x]\zeta.$$

Since the uncertainty is side-wise, $x$ is robust feasible for (27) - (30) if and only if there exists $\tau$ such that the left hand side in (27), evaluated at $x$, is at most $\tau$ for all left-hand-side data from $\mathcal{U}^{\mathrm{left}}_\rho$, while the right hand side, evaluated at $x$, is at least $\tau$ for all right-hand-side data from $\mathcal{U}^{\mathrm{right}}_\rho$. The latter condition can be processed straightforwardly via semidefinite duality, with the following result:

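Proposition 4.1 A pair $(x, \tau)$ is such that $\tau \le a^T x + \beta$ for all $(a, \beta) \in \mathcal{U}^{\rm right}$
if and only if it can be extended to a solution of the system of LMIs

$$
\tau \le x^T a^n + \beta^n + \mathrm{Tr}(SV), \qquad
P^*(V) = \sigma[x] \equiv \begin{pmatrix} \rho\,[x^T a^1 + \beta^1] \\ \vdots \\ \rho\,[x^T a^R + \beta^R] \end{pmatrix}, \qquad
Q^*(V) = 0, \qquad V \succeq 0
\qquad (32)
$$

in variables $x, \tau, V$. Here for a linear mapping $\mathcal{A}(z) = \sum_{i=1}^{k} z_i A_i : \mathbf{R}^k \to \mathbf{S}^m$
taking values in the space of $m \times m$ symmetric matrices,
$\mathcal{A}^*(Z) = \big(\mathrm{Tr}(ZA_1), \dots, \mathrm{Tr}(ZA_k)\big)^T : \mathbf{S}^m \to \mathbf{R}^k$ is the mapping conjugate to $\mathcal{A}(\cdot)$.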

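In view of the above observations, we have

$$
\begin{array}{ll}
(a) & x \text{ is robust feasible for (27) - (30)} \\
 & \Updownarrow \\
 & \exists(\tau, V):\ \left\{ \begin{array}{l} (x, \tau, V) \text{ solves (32)} \\ \|a[x] + A[x]\zeta\|_2 \le \tau \quad \forall(\zeta:\ \zeta^T Q_i \zeta \le 1,\ i = 1, \dots, k) \end{array} \right. \\
 & \Updownarrow \\
 & \exists(\tau, V):\ \left\{ \begin{array}{l} (x, \tau, V) \text{ solves (32)} \\ \|\pm a[x] + A[x]\zeta\|_2 \le \tau \quad \forall(\zeta:\ \zeta^T Q_i \zeta \le 1,\ i = 1, \dots, k) \end{array} \right. \\
 & \Updownarrow \\
 & \exists(\tau, V):\ \left\{ \begin{array}{l} (x, \tau, V) \text{ solves (32)} \\ \|t\,a[x] + A[x]\zeta\|_2 \le \tau \quad \forall(\zeta, t:\ \zeta^T Q_i \zeta \le 1,\ i = 1, \dots, k,\ t^2 \le 1) \end{array} \right. \\
 & \Updownarrow \\
(b) & \exists(\tau, V):\ \left\{ \begin{array}{ll} (1) & (x, \tau, V) \text{ solves (32)} \\ (2) & \tau \ge 0 \\ (3) & \|t\,a[x] + A[x]\zeta\|_2^2 \le \tau^2 \quad \forall(\zeta, t:\ \zeta^T Q_i \zeta \le 1,\ i = 1, \dots, k,\ t^2 \le 1) \end{array} \right.
\end{array}
\qquad (33)
$$

Thus, $x$ is robust feasible if and only if (33.b) holds true. Now observe
that

$$
\begin{array}{c}
\exists(\mu, \lambda_1, \dots, \lambda_k):\ \left\{ \begin{array}{ll}
(a) & \mu \ge 0,\ \lambda_i \ge 0,\ i = 1, \dots, k, \\
(b) & \mu + \sum\limits_{i=1}^{k} \lambda_i \le \tau, \\
(c) & \tau\Big(\mu t^2 + \sum\limits_{i=1}^{k} \lambda_i\, \zeta^T Q_i \zeta\Big) \ge \|t\,a[x] + A[x]\zeta\|_2^2 \quad \forall(t, \zeta)
\end{array} \right. \\
\Downarrow \\
\|t\,a[x] + A[x]\zeta\|_2^2 \le \tau^2 \quad \forall(\zeta, t:\ \zeta^T Q_i \zeta \le 1,\ i = 1, \dots, k,\ t^2 \le 1).
\end{array}
\qquad (34)
$$

Via Schur complement, for nonnegative $\tau$, $\mu$, $\{\lambda_i\}$ the inequality

$$\tau\Big(\mu t^2 + \zeta^T\Big[\sum_{i=1}^{k} \lambda_i Q_i\Big]\zeta\Big) \ge \|t\,a[x] + A[x]\zeta\|_2^2$$

holds true for a given pair $(t, \zeta)$ if and only if

$$
0 \preceq
\begin{pmatrix}
\begin{pmatrix} t \\ \zeta \end{pmatrix}^T
\begin{pmatrix} \mu & \\ & \sum\limits_{i=1}^{k} \lambda_i Q_i \end{pmatrix}
\begin{pmatrix} t \\ \zeta \end{pmatrix}
&
\begin{pmatrix} t \\ \zeta \end{pmatrix}^T
\begin{pmatrix} a^T[x] \\ A^T[x] \end{pmatrix}
\\[6pt]
\begin{pmatrix} a[x] & A[x] \end{pmatrix}
\begin{pmatrix} t \\ \zeta \end{pmatrix}
& \tau I
\end{pmatrix}
=
\begin{pmatrix} \begin{pmatrix} t \\ \zeta \end{pmatrix} & \\ & I \end{pmatrix}^T
\begin{pmatrix}
\mu & & a^T[x] \\
 & \sum\limits_{i=1}^{k} \lambda_i Q_i & A^T[x] \\
a[x] & A[x] & \tau I
\end{pmatrix}
\begin{pmatrix} \begin{pmatrix} t \\ \zeta \end{pmatrix} & \\ & I \end{pmatrix}.
$$

In view of this observation combined with the fact that the union, over all
$t$, $\zeta$, of the image spaces of the matrices
$\begin{pmatrix} \begin{pmatrix} t \\ \zeta \end{pmatrix} & \\ & I \end{pmatrix}$
is the entire space, we conclude that in the case of nonnegative $\tau$, $\mu$, $\{\lambda_i\}$
the relation (34.c) is equivalent to

$$
\begin{pmatrix}
\mu & & a^T[x] \\
 & \sum\limits_{i=1}^{k} \lambda_i Q_i & A^T[x] \\
a[x] & A[x] & \tau I
\end{pmatrix} \succeq 0.
$$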
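For the reader's convenience, the implication stated in (34) can be spelled out in one chain of inequalities. This is a short worked step under the reading that (34.c) is the homogeneous inequality $\tau(\mu t^2 + \sum_i \lambda_i \zeta^T Q_i \zeta) \ge \|t\,a[x] + A[x]\zeta\|_2^2$; it is a sketch of the argument, not a quotation of the authors' proof.

```latex
% Take any (zeta, t) with zeta^T Q_i zeta <= 1 for all i and t^2 <= 1.
\begin{aligned}
\|t\,a[x] + A[x]\zeta\|_2^2
  &\le \tau\Big(\mu t^2 + \sum_{i=1}^{k} \lambda_i\,\zeta^T Q_i \zeta\Big)
     && \text{by (34.c)} \\
  &\le \tau\Big(\mu + \sum_{i=1}^{k} \lambda_i\Big)
     && \text{since } t^2 \le 1,\ \zeta^T Q_i \zeta \le 1,\ \tau, \mu, \lambda_i \ge 0 \\
  &\le \tau^2
     && \text{by (34.b) and } \tau \ge 0 .
\end{aligned}
```

Note that $\tau \ge 0$ follows from (34.a) and (34.b), so no extra assumption is used in the last step.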
Abstract Global convergence to first-order critical points is proved for a variant
of the trust-region SQP-filter algorithm analyzed in (Fletcher, Gould,
Leyffer and Toint). This variant allows the use of two types of step
strategies: the first decomposes the step into its normal and tangential
components, while the second replaces this decomposition by a stronger
condition on the associated model decrease.



1. Introduction 

We analyze an algorithm for solving optimization problems where a 
smooth objective function is to be minimized subject to smooth nonlin- 
ear constraints. No convexity assumption is made. More formally, we 
consider the problem 



$$
\begin{array}{ll}
\text{minimize} & f(x) \\
\text{subject to} & c_{\mathcal{E}}(x) = 0 \\
 & c_{\mathcal{I}}(x) \ge 0,
\end{array}
\qquad (1.1)
$$

where $f$ is a twice continuously differentiable real valued function of the
variables $x \in \mathbb{R}^n$ and $c_{\mathcal{E}}(x)$ and $c_{\mathcal{I}}(x)$ are twice continuously
differentiable functions from $\mathbb{R}^n$ into $\mathbb{R}^m$ and from $\mathbb{R}^n$ into $\mathbb{R}^p$, respectively.
Let $c(x)^T = (c_{\mathcal{E}}(x)^T\ c_{\mathcal{I}}(x)^T)$.

The class of algorithms that we discuss belongs to the class of trust- 
region methods and, more specifically, to that of filter methods intro- 
duced by Fletcher and Leyffer (1997), in which the use of a penalty 
function, a common feature of the large majority of the algorithms for 
constrained optimization, is replaced by the introduction of a so-called 
“filter”. 

A global convergence theory for an algorithm of this class is proposed 
in Fletcher, Leyffer and Toint (1998), in which the objective function is 
locally approximated by a linear function, leading, at each iteration, to 
the (exact) solution of a linear program. This algorithm therefore mixes 
the use of the filter with sequential linear programming (SLP). Simi- 
lar results are shown in Fletcher, Leyffer and Toint (2000), where the 
approximation of the objective function is quadratic, leading to sequen- 
tial quadratic programming (SQP) methods, but at the price of finding 
a global minimizer of the possibly nonconvex quadratic programming 
subproblem, which is known to be a very difficult task. Convergence 
of SQP filter methods was also considered in Fletcher, Gould, Leyffer 
and Toint (1999), where the SQP step was decomposed in “normal” 
and “tangential” components. Although this latter procedure is compu- 
tationally well-defined and considerably less complex than finding the 
global minimum of a general quadratic program, it may sometimes be 
costly, and a simpler strategy, where the step is computed “as a whole” 
can also be of practical interest whenever possible. The purpose of this 
paper, a companion of Fletcher et al. (1999), is to analyze a hybrid algo- 
rithm that uses the decomposition of the step into normal and tangential 
components as infrequently as possible. 

2. A Hybrid Trust-Region SQP-Filter
Algorithm

For the sake of completeness and clarity, we review briefly the main 
constituent parts of the SQP algorithm discussed in Fletcher et al. 
(1999). Sequential quadratic programming methods are iterative. At
a given iterate $x_k$, they implicitly apply Newton’s method to solve (a
local version of) the first-order necessary optimality conditions by solving
the quadratic programming subproblem QP($x_k$) given by

$$
\begin{array}{ll}
\text{minimize} & f_k + \langle g_k, s \rangle + \tfrac{1}{2} \langle s, H_k s \rangle \\
\text{subject to} & c_{\mathcal{E}}(x_k) + A_{\mathcal{E}}(x_k) s = 0 \\
 & c_{\mathcal{I}}(x_k) + A_{\mathcal{I}}(x_k) s \ge 0,
\end{array}
\qquad (2.1)
$$

where $f_k = f(x_k)$, $g_k = g(x_k) \stackrel{\rm def}{=} \nabla_x f(x_k)$, where $A_{\mathcal{E}}(x_k)$ and $A_{\mathcal{I}}(x_k)$ are
the Jacobians of the constraint functions $c_{\mathcal{E}}$ and $c_{\mathcal{I}}$ at $x_k$ and where $H_k$
is a symmetric matrix. We will not immediately be concerned about how
$H_k$ is obtained, but we will return to this point in Section 3. Assuming
a suitable value of $H_k$ can be found, the solution of QP($x_k$) then yields
a step $s_k$. If $s_k = 0$, then $x_k$ is first-order critical for problem (1.1).

2.1 The filter 

Unfortunately, due to the locally convergent nature of Newton’s
iteration, the step $s_k$ may not always be very good. Thus, having computed
(in a so far unspecified manner) a step $s_k$ from our current iterate $x_k$, we
need to decide whether the trial point $x_k + s_k$ is any better than $x_k$ as
an approximate solution to our original problem (1.1). This is achieved
by using the notion of a filter, itself based on that of dominance.

If we define the feasibility measure

$$\theta(x) = \max\Big[\,0,\ \max_{i \in \mathcal{E}} |c_i(x)|,\ \max_{i \in \mathcal{I}} -c_i(x)\Big], \qquad (2.2)$$

we say that a point $x_1$ dominates a point $x_2$ whenever
$\theta(x_1) \le \theta(x_2)$ and $f(x_1) \le f(x_2)$.

Thus, if iterate $x_k$ dominates iterate $x_j$, the latter is of no real interest
to us since $x_k$ is at least as good as $x_j$ on account of both feasibility and
optimality. All we need to do now is to remember iterates that are not
dominated by any other iterates using a structure called a filter. A filter
is a list $\mathcal{F}$ of pairs of the form $(\theta_i, f_i)$ such that either

$$\theta_i < \theta_j \quad \text{or} \quad f_i < f_j$$

for $i \ne j$. Fletcher et al. (1999) propose to accept a new trial iterate
$x_k + s_k$ only if it is not dominated by any other iterate in the filter and
$x_k$. In the vocabulary of multi-criteria optimization, this amounts to
building elements of the efficient frontier associated with the bi-criteria
problem of reducing infeasibility and the objective function value. We
may describe this concept by associating with each iterate $x_k$ its
$(\theta, f)$-pair $(\theta_k, f_k)$ and accept $x_k + s_k$ only if its $(\theta, f)$-pair does not lie, in
the two-dimensional space spanned by constraint violation and objective
function value, above and on the right of a previously accepted pair
(including that associated with $x_k$).
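The feasibility measure (2.2) and the dominance relation can be sketched in a few lines. The code assumes that the constraint values $c_i(x)$ have already been evaluated and are passed in as two plain sequences; the function names are hypothetical.

```python
def theta(c_eq, c_ineq):
    """Feasibility measure (2.2): largest violation over equalities c_i(x) = 0
    and inequalities c_i(x) >= 0, and zero when all constraints hold."""
    worst_eq = max((abs(c) for c in c_eq), default=0.0)
    worst_ineq = max((-c for c in c_ineq), default=0.0)
    return max(0.0, worst_eq, worst_ineq)


def dominates(pair_1, pair_2):
    """True if the (theta, f)-pair pair_1 dominates pair_2, i.e. it is at least
    as good in both constraint violation and objective value."""
    theta_1, f_1 = pair_1
    theta_2, f_2 = pair_2
    return theta_1 <= theta_2 and f_1 <= f_2
```

A point with a smaller violation but a larger objective value is not dominated, which is exactly the trade-off the filter keeps track of.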

While the idea of not accepting dominated trial points is simple and
elegant, it needs to be refined a little in order to provide an efficient
algorithmic tool. In particular, we do not wish to accept $x_k + s_k$ if its
$(\theta, f)$-pair is arbitrarily close to that of $x_k$ or that of a point already in
the filter. Thus Fletcher et al. (1999) set a small “margin” around the
border of the dominated part of the $(\theta, f)$-space in which we shall also
reject trial points. Formally, we say that a point $x$ is acceptable for the
filter if and only if

$$\theta(x) \le (1 - \gamma_\theta)\theta_j \quad \text{or} \quad f(x) \le f_j - \gamma_\theta \theta_j \quad \text{for all } (\theta_j, f_j) \in \mathcal{F}, \qquad (2.3)$$

for some $\gamma_\theta \in (0, 1)$. We also say that $x$ is “acceptable for the filter and
$x_k$” if (2.3) holds with $\mathcal{F}$ replaced by $\mathcal{F} \cup (\theta_k, f_k)$. We thus consider
moving from $x_k$ to $x_k + s_k$ only if $x_k + s_k$ is acceptable for the filter and
$x_k$.

As the algorithm progresses, Fletcher et al. (1999) add $(\theta, f)$-pairs to
the filter. If an iterate $x_k$ is acceptable for $\mathcal{F}$, this is done by adding
the pair $(\theta_k, f_k)$ to the filter and by removing from it every other pair
$(\theta_j, f_j)$ such that $\theta_j \ge \theta_k$ and $f_j - \gamma_\theta \theta_j \ge f_k - \gamma_\theta \theta_k$. We also refer to
this operation as “adding $x_k$ to the filter” although, strictly speaking, it
is the $(\theta, f)$-pair which is added.

We conclude this introduction to the notion of a filter by noting that,
if a point $x_k$ is in the filter or is acceptable for the filter, then any other
point $x$ such that

$$\theta(x) \le (1 - \gamma_\theta)\theta_k \quad \text{and} \quad f(x) \le f_k - \gamma_\theta \theta_k$$

is also acceptable for the filter and $x_k$.
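The margin test (2.3) and the rule for adding a pair to the filter can be sketched as a small class. This is an illustrative data structure under the reading of (2.3) as "$\theta(x) \le (1-\gamma_\theta)\theta_j$ or $f(x) \le f_j - \gamma_\theta\theta_j$ for every filter pair"; the class name, method names and the `extra` argument are hypothetical.

```python
class Filter:
    """Minimal filter of (theta, f)-pairs with margin parameter gamma_theta in (0, 1)."""

    def __init__(self, gamma_theta):
        if not 0.0 < gamma_theta < 1.0:
            raise ValueError("gamma_theta must lie in (0, 1)")
        self.gamma = gamma_theta
        self.pairs = []

    def acceptable(self, theta_x, f_x, extra=None):
        """Test (2.3). Passing extra=(theta_k, f_k) gives the test
        'acceptable for the filter and x_k'."""
        pairs = self.pairs + ([extra] if extra is not None else [])
        return all(theta_x <= (1.0 - self.gamma) * th_j
                   or f_x <= f_j - self.gamma * th_j
                   for th_j, f_j in pairs)

    def add(self, theta_k, f_k):
        """Add (theta_k, f_k) and drop every stored pair (theta_j, f_j) with
        theta_j >= theta_k and f_j - gamma*theta_j >= f_k - gamma*theta_k."""
        g = self.gamma
        self.pairs = [(th_j, f_j) for th_j, f_j in self.pairs
                      if not (th_j >= theta_k
                              and f_j - g * th_j >= f_k - g * theta_k)]
        self.pairs.append((theta_k, f_k))
```

Keeping the pairs in an unsorted list makes both operations linear in the filter size, which is enough for a sketch; an implementation concerned with speed might keep them sorted by $\theta$.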

2.2 The composite SQP step

Of course, the step $s_k$ must be computed, typically by solving, possibly
approximately, a variant of (2.1). In the trust-region approach, one
takes into account the fact that (2.1) only approximates our original
problem locally: the step $s_k$ is thus restricted in norm to ensure that
$x_k + s_k$ remains in a trust-region centred at $x_k$, where we believe this
approximation to be adequate. In other words, we replace QP($x_k$) by
the subproblem TRQP($x_k$, $\Delta_k$) given by

$$
\begin{array}{ll}
\text{minimize} & m_k(x_k + s) \\
\text{subject to} & c_{\mathcal{E}}(x_k) + A_{\mathcal{E}}(x_k) s = 0, \\
 & c_{\mathcal{I}}(x_k) + A_{\mathcal{I}}(x_k) s \ge 0, \\
\text{and} & \|s\| \le \Delta_k,
\end{array}
\qquad (2.4)
$$

for some (positive) value of the trust-region radius $\Delta_k$, where we have
defined

$$m_k(x_k + s) = f_k + \langle g_k, s \rangle + \tfrac{1}{2} \langle s, H_k s \rangle, \qquad (2.5)$$

and where $\|\cdot\|$ denotes the Euclidean norm. This latter choice is purely
for ease of exposition. We could equally use a family of iteration dependent
norms $\|\cdot\|_k$, so long as we require that all members of the family
are uniformly equivalent to the Euclidean norm. The interested reader
may verify that all subsequent developments can be adapted to this
more general case by introducing the constants implied by this uniform
equivalence wherever needed.
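As a concrete reading of the quadratic model (2.5) and of the trust-region bound in TRQP, here is a minimal sketch in plain Python. Vectors are lists of floats and `H_k` is a list of rows; the function names are hypothetical.

```python
import math


def model_value(f_k, g_k, H_k, s):
    """Quadratic model (2.5): f_k + <g_k, s> + 0.5 * <s, H_k s>."""
    n = len(s)
    linear = sum(g_k[i] * s[i] for i in range(n))
    H_s = [sum(H_k[i][j] * s[j] for j in range(n)) for i in range(n)]
    quadratic = 0.5 * sum(s[i] * H_s[i] for i in range(n))
    return f_k + linear + quadratic


def within_trust_region(s, delta_k):
    """Euclidean trust-region constraint ||s|| <= delta_k."""
    return math.sqrt(sum(v * v for v in s)) <= delta_k
```

For example, with $f_k = 1$, $g_k = (1, -2)$, $H_k = \mathrm{diag}(2, 4)$ and $s = (1, 1)$ the model value is $1 - 1 + 3 = 3$.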

Remarkably, most of the existing SQP algorithms assume that an
exact local solution of QP($x_k$) or TRQP($x_k$, $\Delta_k$) is found, although
attempts have been made by Dembo and Tulowitzki (1983) and Murray
and Prieto (1995) to design conditions under which an approximate
solution of the subproblem is acceptable. In contrast, the algorithm of
Fletcher et al. (1999) is close in spirit to the composite-step SQP methods
pioneered by Vardi (1985), Byrd, Schnabel and Shultz (1987), and
Omojokun (1989) and later developed by several authors, including Biegler,
Nocedal and Schmid (1995), El-Alem (1995, 1999), Byrd, Gilbert and
Nocedal (2000a), Byrd, Hribar and Nocedal (2000b), Bielschowsky and
Gomes (1998), Liu and Yuan (1998) and Lalee, Nocedal and Plantenga
(1998). It decomposes the step $s_k$ into the sum of two distinct
components, a normal step $n_k$, such that $x_k + n_k$ satisfies the constraints
of TRQP($x_k$, $\Delta_k$), and a tangential step $t_k$, whose purpose is to obtain
reduction of the objective function’s model while continuing to satisfy
those constraints. The step $s_k$ is then called composite. More formally,
we write

$$s_k = n_k + t_k \qquad (2.6)$$

and assume that

$$c_{\mathcal{E}}(x_k) + A_{\mathcal{E}}(x_k) n_k = 0, \qquad c_{\mathcal{I}}(x_k) + A_{\mathcal{I}}(x_k) n_k \ge 0, \qquad (2.7)$$

$$\|s_k\| \le \Delta_k, \qquad (2.8)$$

and

$$c_{\mathcal{E}}(x_k) + A_{\mathcal{E}}(x_k) s_k = 0, \qquad c_{\mathcal{I}}(x_k) + A_{\mathcal{I}}(x_k) s_k \ge 0. \qquad (2.9)$$

Of course, this is a strong assumption, since in particular (2.7) or (2.8)/
(2.9) may not have a solution. We shall return to this possibility shortly.
Given our assumption, there are many ways to compute $n_k$ and $t_k$. For
instance, we could compute $n_k$ from

$$n_k = P_k[x_k] - x_k, \qquad (2.10)$$

where $P_k$ is the orthogonal projector onto the feasible set of QP($x_k$). No
specific choice for $n_k$ is made, but one instead assumes that $n_k$ exists
when the maximum violation of the nonlinear constraints at the $k$-th
iterate $\theta_k = \theta(x_k)$ is sufficiently small, and that $n_k$ is then reasonably
scaled with respect to the values of the constraints. In other words,
Fletcher et al. (1999) assume that

$$n_k \text{ exists and } \|n_k\| \le \kappa_{\rm usc}\,\theta_k \text{ whenever } \theta_k \le \delta_n \qquad (2.11)$$

for some constants $\kappa_{\rm usc} > 0$ and $\delta_n > 0$. This assumption is also used by
Dennis, El-Alem and Maciel (1997) and Dennis and Vicente (1997) in
the context of problems with equality constraints only. It can be shown
not to impose conditions on the constraints or the normal step itself that
are unduly restrictive (see Fletcher et al. (1999) for a discussion).

Having defined the normal step, we are in position to use it if it falls
within the trust-region, that is if $\|n_k\| \le \Delta_k$. In this case, we write

$$x_k^N = x_k + n_k, \qquad (2.12)$$

and observe that $n_k$ satisfies the constraints of TRQP($x_k$, $\Delta_k$) and thus
also of QP($x_k$). It is crucial to note, at this stage, that such an $n_k$ may
fail to exist because the constraints of QP($x_k$) may be incompatible, in
which case $P_k$ is undefined, or because all feasible points for QP($x_k$)
may lie outside the trust region.

Let us continue to consider the case where this problem does not arise,
and a normal step $n_k$ has been found with $\|n_k\| \le \Delta_k$. We then have
to find a tangential step $t_k$, starting from $x_k^N$ and satisfying (2.8) and
(2.9), with the aim of decreasing the value of the objective function. As
always in trust-region methods, this is achieved by computing a step
that produces a sufficient decrease in $m_k$, which is to say that we wish
$m_k(x_k^N) - m_k(x_k + s_k)$ to be “sufficiently large”. Of course, this is only
possible if the maximum size of $t_k$ is not too small, which is to say
that $x_k^N$ is not too close to the trust-region boundary. We formalize this
condition by replacing our condition that $\|n_k\| \le \Delta_k$ by the stronger
requirement that

$$\|n_k\| \le \kappa_\Delta \Delta_k \min[1, \kappa_\mu \Delta_k^{\mu_k}], \qquad (2.13)$$

for some $\kappa_\Delta \in (0, 1]$, some $\kappa_\mu > 0$ and some $\mu_k \in [0, 1)$. If condition
(2.13) does not hold, Fletcher et al. (1999) presume that the computation
of $t_k$ is unlikely to produce a satisfactory decrease in $m_k$, and proceed
just as if the feasible set of TRQP($x_k$, $\Delta_k$) were empty. If $n_k$ can be
computed and (2.13) holds, TRQP($x_k$, $\Delta_k$) is said to be compatible for
$\mu$. In this case at least a sufficient model decrease seems possible, in
the form of a familiar Cauchy-point condition. In order to formalize this
notion, we recall that the feasible set of QP($x_k$) is convex, and we can
therefore introduce the first-order criticality measure

$$
\chi_k = \left| \min_{\substack{A_{\mathcal{E}}(x_k) t = 0 \\ c_{\mathcal{I}}(x_k) + A_{\mathcal{I}}(x_k)(n_k + t) \ge 0 \\ \|t\| \le 1}} \langle g_k + H_k n_k, t \rangle \right| \qquad (2.14)
$$

(see Conn, Gould, Sartenaer and Toint, 1993). Note that this function
is zero if and only if $x_k^N$ is a first-order critical point of the linearized
“tangential” problem

$$
\begin{array}{ll}
\text{minimize} & \langle g_k + H_k n_k, t \rangle + \tfrac{1}{2} \langle H_k t, t \rangle \\
\text{subject to} & A_{\mathcal{E}}(x_k) t = 0 \\
 & c_{\mathcal{I}}(x_k) + A_{\mathcal{I}}(x_k)(n_k + t) \ge 0,
\end{array}
\qquad (2.15)
$$

which is equivalent to QP($x_k$) with $s = n_k + t$. The sufficient decrease
condition then consists in assuming that there exists a constant $\kappa_{\rm tmd} > 0$
such that

$$m_k(x_k^N) - m_k(x_k^N + t_k) \ge \kappa_{\rm tmd}\, \chi_k \min\Big[\frac{\chi_k}{\beta_k}, \Delta_k\Big] \qquad (2.16)$$

whenever TRQP($x_k$, $\Delta_k$) is compatible, where $\beta_k = 1 + \|H_k\|$. We know
from Toint (1988) and Conn et al. (1993) that this condition holds if the
model reduction exceeds that which would be obtained at the generalized
Cauchy point, that is the point resulting from a backtracking curvilinear
search along the projected gradient path from $x_k^N$, that is

$$x_k(\alpha) = P_k[x_k^N - \alpha \nabla_x m_k(x_k^N)].$$

This technique therefore provides an implementable algorithm for
computing a step that satisfies (2.16) (see Gould, Hribar and Nocedal, 1998
for an example in the case where $c(x) = c_{\mathcal{E}}(x)$, or Toint, 1988 and Moré
and Toraldo, 1991 for the case of bound constraints), but, of course,
reduction of $m_k$ beyond that imposed by (2.16) is often possible and
desirable if fast convergence is sought. Also note that the minimization
problem of the right-hand side of (2.14) reduces to a linear programming
problem if we choose to use a polyhedral norm in its definition at
iteration $k$.

If TRQP($x_k$, $\Delta_k$) is not compatible for $\mu$, that is when the feasible
set determined by the constraints of QP($x_k$) is empty, or the freedom
left to reduce $m_k$ within the trust region is too small in the sense that
(2.13) fails, solving TRQP($x_k$, $\Delta_k$) is most likely pointless, and we must
consider an alternative. Observe that, if $\theta(x_k)$ is sufficiently small and
the true nonlinear constraints are locally compatible, the linearized
constraints should also be compatible, since they approximate the nonlinear
constraints (locally) correctly. Furthermore, the feasible region for
the linearized constraints should then be close enough to $x_k$ for there
to be some room to reduce $m_k$, at least if $\Delta_k$ is large enough. If the
nonlinear constraints are locally incompatible, we have to find a
neighbourhood where this is not the case, since the problem (1.1) does not
make sense in the current one. Fletcher et al. (1999) thus rely on a
restoration procedure, whose aim is to produce a new point $x_k + r_k$ for
which TRQP($x_k + r_k$, $\Delta_{k+1}$) is compatible for some $\Delta_{k+1} > 0$; another
condition will actually be needed, which we will discuss shortly.
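The compatibility test used above can be written down directly. The sketch below assumes the reading of (2.13) as $\|n_k\| \le \kappa_\Delta \Delta_k \min[1, \kappa_\mu \Delta_k^{\mu_k}]$ and represents a missing normal step (empty linearized feasible set) by `None`; the function name and that convention are hypothetical.

```python
import math


def is_compatible(n_k, delta_k, kappa_delta, kappa_mu, mu_k):
    """Return True if a normal step exists and satisfies the assumed form of (2.13).

    n_k is a list of floats, or None when no normal step could be computed.
    """
    if n_k is None:
        return False
    norm_n = math.sqrt(sum(v * v for v in n_k))
    bound = kappa_delta * delta_k * min(1.0, kappa_mu * delta_k ** mu_k)
    return norm_n <= bound
```

When the test fails, the algorithm described in the text does not attempt a tangential step and calls the restoration procedure instead.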

The idea of the restoration procedure is to (approximately) solve

$$\min_x\ \theta(x) \qquad (2.17)$$

starting from $x_k$, the current iterate. This is a non-smooth problem, but
there exist methods, possibly of trust-region type (such as that suggested
by Yuan, 1994), which can be successfully applied to solve it. Thus we
will not describe the restoration procedure in detail. Note that we have
chosen here to reduce the infinity norm of the constraint violation, but
we could equally well consider other norms, such as $\ell_1$ or $\ell_2$, in which case
the methods of Fletcher and Leyffer (1998) or of El-Hallabi and Tapia
(1995) and Dennis, El-Alem and Williamson (1999) can respectively be
considered. Of course, this technique only guarantees convergence to a
first-order critical point of the chosen measure of constraint violation,
which means that, in fact, the restoration procedure may fail as this
critical point may not be feasible for the constraints of (1.1). However,
even in this case, the result of the procedure is of interest because it
typically produces a local minimizer of $\theta(x)$, or of whatever other measure
of constraint violation we choose for the restoration, yielding a point of
locally-least infeasibility.

There seems to be no easy way to circumvent this drawback, as it is
known that finding a feasible point or proving that no such point exists is
a global optimization problem and can be as difficult as the optimization
problem (1.1) itself. One therefore has to accept two possible outcomes
of the restoration procedure: either the procedure fails in that it does
not produce a sequence of iterates converging to feasibility, or a point
$x_k + r_k$ is produced such that $\theta(x_k + r_k)$ is as small as desired.

2.3 An alternative step 

Is it possible to find a cheaper alternative to computing a normal
step, finding a generalized Cauchy point and explicitly checking (2.16)?
Suppose, for now, that it is possible to compute a point $x_k + s_k$ directly
to satisfy the constraints of TRQP($x_k$, $\Delta_k$) and for which

mfe(xfc) - mk{xk + Sk) > min[7rfc, A^] (2.18) 

and $\pi_k = \pi(x_k)$, where $\pi$ is a continuous function of its argument that is
a criticality measure for TRQP($x_k$, $\Delta_k$). Such an $s_k$ could for instance be
computed by applying any efficient method to this latter problem (we
might think of interior point methods of the type described in Conn,
Gould, Orban and Toint, 2000, for instance), and its performance could
be assessed by computing

$$\pi(x) = \min_{y \,\mid\, y_{\mathcal{I}} \ge 0} \|g(x) - A(x)^T y\|.$$

Of course, nothing guarantees that such an $s_k$ exists (depending on our
choice of $\pi(x)$) or is cheaply computable for each $x_k$, which means that
we may have to resort to the normal-tangential strategy of Fletcher et
al. (1999) if such problems arise. However, if we can find $s_k$ at a fraction
of the cost of computing $n_k$ and $t_k$, can we use it inside an SQP-filter
algorithm and maintain the desirable convergence to first-order critical
points?

Obviously, the answer to that question depends on the manner in
which the use of $s_k$ is integrated into a complete algorithm.

2.4 A hybrid SQP-filter Algorithm 

We have now discussed the main ingredients of the class of algorithms 
we wish to consider, and we are now ready to define it formally as 
Algorithm 2.1: 



Algorithm 2.1: Hybrid SQP-filter Algorithm 

Step 0: Initialization. Let an initial point $x_0$, an initial trust-region
radius $\Delta_0 > 0$ and an initial symmetric matrix $H_0$ be
given, as well as constants $\gamma_0 < \gamma_1 \le 1 \le \gamma_2$, $0 < \eta_1 < \eta_2 < 1$,
$\gamma_\theta \in (0, 1)$, $\kappa_\theta \in (0, 1)$, $\kappa_\Delta \in (0, 1]$, $\kappa_\mu > 0$, $\mu \in (0, 1)$,
$\psi > 1/(1 + \mu)$, $\kappa_u > 0$ and $\kappa_{\rm tmd} \in (0, 1]$. Compute $f(x_0)$
and $c(x_0)$. Set $\mathcal{F} = \emptyset$ and $k = 0$.

Step 1: Test for optimality. If $\theta_k = 0$ and either $\chi_k = 0$ or $\pi_k = 0$, stop.

Step 2: Alternative step. If 

9k > K^Ak min[l, A^], (2.19) 

set $\mu_k = \mu$ and go to Step 3. Otherwise, attempt to compute
a step $s_k$ that satisfies the constraints of TRQP($x_k$, $\Delta_k$) and
(2.18). If this succeeds, go to Step 4. Otherwise, set $\mu_k = 0$.

Step 3: Composite step.

Step 3a: Normal component. Attempt to compute a normal
step $n_k$. If TRQP($x_k$, $\Delta_k$) is compatible for $\mu_k$, go to
Step 3b. Otherwise, include $x_k$ in the filter and compute
a restoration step $r_k$ for which TRQP($x_k + r_k$, $\Delta_{k+1}$) is
compatible for some $\Delta_{k+1} > 0$, and $x_k + r_k$ is acceptable
for the filter. If this proves impossible, stop. Otherwise,
define $x_{k+1} = x_k + r_k$ and go to Step 7.

Step 3b: Tangential component. Compute a tangential
step $t_k$ and set $s_k = n_k + t_k$.

Step 4: Tests to accept the trial step.

■ Evaluate $c(x_k + s_k)$ and $f(x_k + s_k)$.

■ If $x_k + s_k$ is not acceptable for the filter and $x_k$, set $x_{k+1} = x_k$,
choose $\Delta_{k+1} \in [\gamma_0 \Delta_k, \gamma_1 \Delta_k]$, set $n_{k+1} = n_k$ if Step 3
was executed, and go to Step 7.

■ If

$$m_k(x_k) - m_k(x_k + s_k) \ge \kappa_\theta \theta_k^\psi \qquad (2.20)$$

and

$$\rho_k \stackrel{\rm def}{=} \frac{f(x_k) - f(x_k + s_k)}{m_k(x_k) - m_k(x_k + s_k)} < \eta_1, \qquad (2.21)$$

again set $x_{k+1} = x_k$, choose $\Delta_{k+1} \in [\gamma_0 \Delta_k, \gamma_1 \Delta_k]$, set
$n_{k+1} = n_k$ if Step 3 was executed, and go to Step 7.

Step 5: Test to include the current iterate in the filter. If
(2.20) fails, include $x_k$ in the filter $\mathcal{F}$.

Step 6: Move to the new iterate. Set $x_{k+1} = x_k + s_k$ and
choose $\Delta_{k+1}$ such that

$\Delta_{k+1} \in [\Delta_k, \gamma_2 \Delta_k]$ if $\rho_k \ge \eta_2$ and (2.20) holds.

Step 7: Update the Hessian approximation. Determine $H_{k+1}$.
Increment $k$ by one and go to Step 1.
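The acceptance logic of Steps 4 to 6 can be summarized in a small decision function. This is a sketch under stated assumptions: (2.20) is read as $m_k(x_k) - m_k(x_k+s_k) \ge \kappa_\theta\theta_k^\psi$, (2.21) as $\rho_k < \eta_1$, and the filter test is assumed to have been carried out beforehand and passed in as a boolean. The names `StepDecision` and `assess_trial_step`, the default constants, and the guard for a non-positive model decrease are all illustrative choices, not taken from the paper.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class StepDecision:
    accept: bool                                     # move to x_k + s_k?
    add_xk_to_filter: bool                           # Step 5
    radius_interval: Optional[Tuple[float, float]]   # None: no restriction imposed


def assess_trial_step(acceptable, theta_k, f_k, f_trial, model_decrease, delta_k,
                      kappa_theta=1e-4, psi=2.0, eta_1=0.01, eta_2=0.9,
                      gamma_0=0.1, gamma_1=0.5, gamma_2=2.0):
    """Steps 4-6 for a trial step; `acceptable` is the outcome of the test
    'x_k + s_k is acceptable for the filter and x_k'."""
    shrink = (gamma_0 * delta_k, gamma_1 * delta_k)
    if not acceptable:
        return StepDecision(False, False, shrink)
    # Assumed form of (2.20): the model decrease outweighs the violation.
    f_type = model_decrease >= kappa_theta * theta_k ** psi
    rho = None
    if f_type:
        # Guard added by this sketch: treat a non-positive denominator as failure.
        rho = ((f_k - f_trial) / model_decrease
               if model_decrease > 0.0 else float("-inf"))
        if rho < eta_1:                              # assumed form of (2.21)
            return StepDecision(False, False, shrink)
    add_to_filter = not f_type                       # Step 5
    if f_type and rho >= eta_2:                      # Step 6
        return StepDecision(True, add_to_filter, (delta_k, gamma_2 * delta_k))
    return StepDecision(True, add_to_filter, None)
```

Returning an interval rather than a single radius mirrors the algorithm statement, which only constrains $\Delta_{k+1}$ to a range and leaves the actual choice to the implementation.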



This algorithm differs from that of Fletcher et al. (1999) in that it
contains the alternative step strategy, but also because it allows the
normal step to satisfy (2.13) with $\mu = 0$ whenever (2.19) holds, that
is whenever the current iterate is sufficiently feasible. (As we will see
later, (2.13) with $\mu > 0$ can be viewed as an implicit technique to impose
(2.19).)

As in Fletcher and Leyffer (1997) and Fletcher and Leyffer (1998),
one may choose $\psi = 2$. (Note that the choice $\psi = 1$ is always possible
because $\mu > 0$.) Reasonable values for the constants might then be

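$\gamma_0 = 0.1$, $\gamma_1 = 0.5$, $\gamma_2 = 2$, $\eta_1 = 0.01$, $\eta_2 = 0.9$, $\gamma_\theta = 10^{-4}$,

$\kappa_\Delta = 0.7$, $\kappa_\mu = 100$, $\mu = 0.01$, $\kappa_\theta = 10^{-4}$, and $\kappa_{\rm tmd} = 0.01$,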

but it is too early to know if these are even close to the best possible 
choices. 

As in Fletcher et al. (1999), some comments on this algorithm are
now in order. Observe first that, by construction, every iterate $x_k$ must
be acceptable for the filter at the beginning of iteration $k$, irrespective
of the possibility that it is added to the filter later. Also note that
the restoration step cannot be zero, that is restoration cannot simply
entail enlarging the trust-region radius to ensure (2.13), even if $n_k$ exists.
This is because $x_k$ is added to the filter before $r_k$ is computed, and $x_k + r_k$
must be acceptable for the filter which now contains $x_k$. Also note that
the restoration procedure cannot be applied on two successive iterations,
since the iterate $x_k + r_k$ produced by the first of these iterations is both
compatible and acceptable for the filter.

For the restoration procedure in Step 3a to succeed, we have to
evaluate whether TRQP($x_k + r_k$, $\Delta_{k+1}$) is compatible for a suitable value
of $\Delta_{k+1}$. This requires that a suitable normal step be computed which
successfully passes the test (2.13). Of course, once this is achieved, this
normal step may be reused at iteration $k + 1$, if the composite step
strategy is used.

As it stands, the algorithm is not specific about how to choose $\Delta_{k+1}$
during a restoration iteration. On one hand, there is an advantage to
choosing a large $\Delta_{k+1}$, since this allows a large step and one hopes good
progress. It also makes it easier to satisfy (2.13). On the other, it may be
unwise to choose it to be too large, as this may possibly result in a large
number of unsuccessful iterations, during which the radius is reduced,
before the algorithm can make any progress. A possible choice might
be to restart from the radius obtained during the restoration iteration
itself, if it uses a trust-region method. Reasonable alternatives would be
to use the average radius observed during past successful iterations, or
to apply the internal doubling strategy of Byrd et al. (1987) to increase
the new radius, or even to consider the technique described by Sartenaer
(1997). However, we recognize that extensive numerical experience will
remain the ultimate measure of any suggestion at this level.

The role of condition (2.20) may be interpreted as follows. If this
condition fails, then one may think that the constraint violation is
significant and that one should aim to improve on this situation in the
future, by inserting the current point in the filter. Fletcher and Leyffer
(1997) use the term of “$\theta$-step” in such circumstances, to indicate that
the main preoccupation is to improve feasibility. On the other hand,
if condition (2.20) holds, then the reduction in the objective function
predicted by the model is more significant than the current constraint
violation and it is thus appealing to let the algorithm behave as if it were
unconstrained. Fletcher and Leyffer (1997) use the term of “$f$-step” to
denote the step generated, in order to reflect the dominant role of the
objective function $f$ in this case. In this case, it is important that the
predicted decrease in the model is realized by the actual decrease in the
function, which is why we also require that (2.21) does not hold. In
particular, if the iterate $x_k$ is feasible, then (2.19) and (2.11) imply that
$x_k = x_k^N$ and we obtain that

$$\kappa_\theta \theta_k^\psi = 0 \le m_k(x_k^N) - m_k(x_k + s_k) = m_k(x_k) - m_k(x_k + s_k). \qquad (2.22)$$

As a consequence, the filter mechanism is irrelevant if all iterates are
feasible, and the algorithm reduces to a classical unconstrained
trust-region method. Another consequence of (2.22) is that no feasible iterate
is ever included in the filter, which is crucial in allowing finite
termination of the restoration procedure. Indeed, if the restoration procedure is
required at iteration $k$ of the filter algorithm and produces a sequence
of points $\{x_{k,j}\}$ converging to feasibility, there must be an iterate $x_{k,j}$
for which

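$$\theta(x_{k,j}) \le \min\Big[(1 - \gamma_\theta)\,\theta_k^{\min},\ \frac{\kappa_\Delta}{\kappa_{\rm usc}}\, \Delta_{k+1} \min[1, \kappa_\mu \Delta_{k+1}^{\mu}]\Big],$$

where

$$\theta_k^{\min} = \min_{i \in \mathcal{Z},\ i \le k} \theta_i > 0$$

and

$$\mathcal{Z} = \{k \mid x_k \text{ is added to the filter}\}.$$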

Moreover, $\theta(x_{k,j})$ must eventually be small enough to ensure, using our
assumption on the normal step, the existence of a normal step from
$x_{k,j}$. In other words, the restoration iteration must eventually find an
iterate $x_{k,j}$ which is acceptable for the filter and for which the normal step
exists and satisfies (2.13), i.e. an iterate $x_{k,j}$ which is both acceptable and
compatible. As a consequence, the restoration procedure will terminate
in a finite number of steps, and the filter algorithm may then proceed.
Note that the restoration step may not terminate in a finite number of
iterations if we do not assume the existence of the normal step when the
constraint violation is small enough, even if this violation converges to
zero (see Fletcher, Leyffer and Toint, 1998, for an example).

Notice also that (2.20) ensures that the denominator of $\rho_k$ in (2.21)
will be strictly positive whenever $\theta_k$ is. If $\theta_k = 0$, then $x_k = x_k^N$, and the
denominator of (2.21) will be strictly positive unless $x_k$ is a first-order
critical point because of (2.16).

The attentive reader will have observed that we have defined $n_{k+1}$ in
Step 4 in the cases where iteration $k$ is unsuccessful (just before
branching back to Step 2), while we may not use it if the alternative step of
Step 2 is then used at iteration $k + 1$. This is to keep the expression
of the algorithm as general as possible: a more restrictive version would
impose a branch back to Step 3b from Step 4 if iteration $k$ is
unsuccessful, but this would then prevent the use of an alternative step at
iteration $k + 1$. We have chosen not to impose that restriction, but we
obviously require that $n_{k+1}$ is used in Step 3a whenever it has been set
at iteration $k$, instead of recomputing it from scratch.

Finally, note that Step 6 allows a relatively wide choice of the new
trust-region radius $\Delta_{k+1}$. While the stated conditions are sufficient for
the theory developed below, one must obviously be more specific in
practice. For instance, one may wish to distinguish, at this point in the
algorithm, the cases where (2.20) fails or holds. If (2.20) fails, the main
effect of the current iteration is not to reduce the model (which makes
the value of $\rho_k$ essentially irrelevant), but rather to reduce the constraint
violation (which is taken care of by inserting the current iterate in the
filter at Step 5). In this case, Step 6 imposes no further restriction on
$\Delta_{k+1}$. In practice, it may be reasonable not to reduce the trust-region
radius, because this might cause too small steps towards feasibility or an
unnecessary restoration phase. However, there is no compelling reason to
increase the radius either, given the compatibility of TRQP($x_k$, $\Delta_k$). A
reasonable strategy might then be to choose $\Delta_{k+1} = \Delta_k$. If, on the other
hand, (2.20) holds, the emphasis of the iteration is then on reducing the
objective function, a case akin to unconstrained minimization. Thus a
more detailed rule of the type

$$
\Delta_{k+1} \in \left\{ \begin{array}{ll}
[\gamma_1 \Delta_k, \gamma_2 \Delta_k] & \text{if } \rho_k \in [\eta_1, \eta_2), \\
{[\Delta_k, \gamma_2 \Delta_k]} & \text{if } \rho_k \ge \eta_2
\end{array} \right.
$$

seems reasonable in these circumstances.
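The more detailed radius rule can be sketched as follows, assuming it reads $\Delta_{k+1} \in [\gamma_1\Delta_k, \gamma_2\Delta_k]$ if $\rho_k \in [\eta_1, \eta_2)$ and $\Delta_{k+1} \in [\Delta_k, \gamma_2\Delta_k]$ if $\rho_k \ge \eta_2$. The function name is hypothetical, and raising an error for $\rho_k < \eta_1$ is a choice of this sketch, since such steps are rejected earlier in the algorithm.

```python
def next_radius_interval(rho_k, delta_k, eta_1, eta_2, gamma_1, gamma_2):
    """Admissible interval for the next radius after an accepted f-type step."""
    if rho_k >= eta_2:
        # Very successful step: the radius may be kept or enlarged.
        return (delta_k, gamma_2 * delta_k)
    if rho_k >= eta_1:
        # Successful step: the radius may shrink moderately or grow.
        return (gamma_1 * delta_k, gamma_2 * delta_k)
    raise ValueError("rule applies only to steps with rho_k >= eta_1")
```

Any concrete choice inside the returned interval is compatible with the rule; picking the midpoint or the upper end are both simple options.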

3. Convergence to First-Order Critical Points 

We now prove that our hybrid algorithm generates a globally con- 
vergent sequence of iterates, at least if the restoration iteration always 
succeeds. For the purpose of our analysis, we shall consider 

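$$\mathcal{S} = \{k \mid x_{k+1} = x_k + s_k\},$$

the set of (indices of) successful iterations,

$$
\mathcal{R} = \left\{ k \;\middle|\; \begin{array}{l}
\text{Step 3 is executed and} \\
\text{either TRQP}(x_k, \Delta_k) \text{ has no feasible point} \\
\text{or } \|n_k\| > \kappa_\Delta \Delta_k \min[1, \kappa_\mu \Delta_k^{\mu_k}]
\end{array} \right\},
$$

the set of restoration iterations,

$$\mathcal{A} = \{k \mid \text{the alternative step is used at iteration } k\},$$

and

$$\mathcal{C} = \{k \mid s_k = n_k + t_k\},$$

the set of iterations where a composite step is used (with $\mu_k \ge 0$). Note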
that (2.19) implies that 

9k < /i:uAfcmin[l, A^] < (3.1) 

for every $k \in \mathcal{A}$. Also note that $\{1, 2, \dots\} = \mathcal{A} \cup \mathcal{C} \cup \mathcal{R}$ and that $\mathcal{R} \subseteq \mathcal{Z}$.

In order to obtain our global convergence result, we will use the as- 
sumptions 

AS1: $f$ and the constraint functions $c_{\mathcal{E}}$ and $c_{\mathcal{I}}$ are twice continuously
differentiable;

AS2: there exists $\kappa_{\rm umh} > 1$ such that

$$\|H_k\| \le \kappa_{\rm umh} - 1 < \kappa_{\rm umh} \quad \text{for all } k,$$

AS3: the iterates $\{x_k\}$ remain in a closed, bounded domain $X \subset \mathbb{R}^n$.

If, for example, $H_k$ is chosen as the Hessian of the Lagrangian function

$$\ell(x, y) = f(x) + \langle y_{\mathcal{E}}, c_{\mathcal{E}}(x) \rangle + \langle y_{\mathcal{I}}, c_{\mathcal{I}}(x) \rangle$$

at $x_k$, in that

$$H_k = \nabla_{xx} f(x_k) + \sum_{i \in \mathcal{E} \cup \mathcal{I}} [y_k]_i \nabla_{xx} c_i(x_k), \tag{3.2}$$

where $[y_k]_i$ denotes the $i$-th component of the vector of Lagrange
multipliers $y_k^T = (y_{\mathcal{E},k}^T \;\; y_{\mathcal{I},k}^T)$, then we see from AS1 and AS3 that AS2 is
satisfied when these multipliers remain bounded. The same is true if the
Hessian matrices in (3.2) are replaced by bounded approximations.

A first immediate consequence of AS1-AS3 is that there exists a
constant $\kappa_{\rm ubh} > 1$ such that, for all $k$,

$$|f(x_k + s_k) - m_k(x_k + s_k)| \le \kappa_{\rm ubh} \Delta_k^2. \tag{3.3}$$

A proof of this property, based on Taylor expansion, may be found,
for instance, in Toint (1988). A second important consequence of our
assumptions is that AS1 and AS3 together directly ensure that, for all
$k$,

$$f^{\min} \le f(x_k) \le f^{\max} \quad \text{and} \quad 0 \le \theta_k \le \theta^{\max} \tag{3.4}$$

for some constants $f^{\min} \le f^{\max}$ and $\theta^{\max} > 0$. Thus the part of the
$(\theta, f)$-space in which the $(\theta, f)$-pairs associated with the filter
iterates lie is restricted to the rectangle

$$\mathcal{M}_0 = [0, \theta^{\max}] \times [f^{\min}, f^{\max}],$$

whose area, surf($\mathcal{M}_0$), is clearly finite.

We also note the following simple consequence of (2.11) and AS3.

Lemma 1 Suppose that Algorithm 2.1 is applied to problem (1.1).
Suppose also that (2.11) and AS3 hold, that $k \in \mathcal{C}$, and that

$$\theta_k \le \delta_n.$$

Then there exists a constant $\kappa_{\rm lsc} > 0$ independent of $k$ such that

$$\kappa_{\rm lsc} \theta_k \le \|n_k\|. \tag{3.5}$$

Proof. Since $k \in \mathcal{C}$, we first obtain that $n_k$ exists (as a consequence
of (2.11)), and define

$$\mathcal{V}_k = \{j \in \mathcal{E} \mid \theta_k = |c_j(x_k)|\} \cup \{j \in \mathcal{I} \mid \theta_k = -c_j(x_k)\},$$

that is the subset of most-violated constraints. From the definitions
of $\theta_k$ in (2.2) and of the normal step in (2.7) we obtain, using the
Cauchy-Schwarz inequality, that

$$\theta_k \le |\langle \nabla_x c_j(x_k), n_k \rangle| \le \|\nabla_x c_j(x_k)\| \, \|n_k\| \tag{3.6}$$

for all $j \in \mathcal{V}_k$. But AS3 ensures that there exists a constant $\kappa_{\rm lsc} > 0$
such that

$$\max_{j \in \mathcal{E} \cup \mathcal{I}} \; \max_{x \in X} \|\nabla_x c_j(x)\| = \frac{1}{\kappa_{\rm lsc}}.$$

We then obtain the desired conclusion by substituting this bound in
(3.6). □

Our assumptions and the definition of $\chi_k$ in (2.14) ensure that $\theta_k$ and
$\chi_k$ can be used (together) to measure criticality for problem (1.1).



Lemma 2 Suppose that Algorithm 2.1 is applied to problem (1.1)
and that finite termination does not occur. Suppose also that AS1
and AS3 hold, and that there exists a subsequence $\{k_i\}$ with
$k_i \notin \mathcal{R}$ such that

$$\lim_{i \to \infty} \theta_{k_i} = 0, \quad \lim_{\substack{i \to \infty \\ k_i \in \mathcal{C}}} \chi_{k_i} = 0 \quad \text{and} \quad \lim_{\substack{i \to \infty \\ k_i \in \mathcal{A}}} \pi_{k_i} = 0. \tag{3.7}$$

Then every limit point of the sequence $\{x_{k_i}\}$ is a first-order critical
point for problem (1.1).



Proof. Consider $x_*$, a limit point of the sequence $\{x_{k_i}\}$, whose
existence is ensured by AS3, and assume that $\{k_\ell\} \subseteq \{k_i\}$ is the index
set of a subsequence such that $\{x_{k_\ell}\}$ converges to $x_*$. If $\{k_\ell\}$ contains
infinitely many indices of $\mathcal{A}$, the definition of $\pi_k$ implies that $x_*$ is a
first-order critical point for problem (1.1). If this is not the case, the
fact that $k_\ell \notin \mathcal{R}$ implies that $n_{k_\ell}$ satisfies (2.11) for sufficiently large $\ell$
and converges to zero, because $\{\theta_{k_\ell}\}$ converges to zero and because of
the second part of this condition. As a consequence, we deduce from (2.12) that
$\{x^N_{k_\ell}\}$ also converges to $x_*$. Since the minimization problem occurring
in the definition of $\chi_{k_\ell}$ (in (2.14)) is convex, we then obtain from
classical perturbation theory (see, for instance, Fiacco, 1983, pp. 14-17),
AS1 and the first part of (3.7) that

$$\left| \min_{\substack{A_{\mathcal{E}}(x_*) t = 0 \\ c_{\mathcal{I}}(x_*) + A_{\mathcal{I}}(x_*) t \ge 0 \\ \|t\| \le 1}} \langle g_*, t \rangle \right| = 0.$$

This in turn guarantees that $x_*$ is first-order critical for problem (1.1).

□

We start our analysis by examining what happens when an infinite number
of iterates (that is, their $(\theta, f)$-pairs) are added to the filter.



Lemma 3 Suppose that Algorithm 2.1 is applied to problem (1.1)
and that finite termination does not occur. Suppose also that AS1
and AS3 hold and that $|\mathcal{Z}| = \infty$. Then

$$\lim_{\substack{k \to \infty \\ k \in \mathcal{Z}}} \theta_k = 0.$$



Proof. Suppose, for the purpose of obtaining a contradiction, that
there exists an infinite subsequence $\{k_i\} \subseteq \mathcal{Z}$ such that

$$\theta_{k_i} \ge \epsilon \tag{3.8}$$

for all $i$ and for some $\epsilon > 0$. At each iteration $k_i$, the $(\theta, f)$-pair
associated with $x_{k_i}$, that is $(\theta_{k_i}, f_{k_i})$, is added to the filter. This
means that no other $(\theta, f)$-pair can be added to the filter at a later
stage within the square

$$[\theta_{k_i} - \gamma_\theta \epsilon, \theta_{k_i}] \times [f_{k_i} - \gamma_\theta \epsilon, f_{k_i}],$$

or with the intersection of this square with $\mathcal{M}_0$. But the area of each
of these squares is $\gamma_\theta^2 \epsilon^2$. Thus the set $\mathcal{M}_0$ is completely covered by at
most a finite number of such squares. This puts a finite upper bound
on the number of iterations in $\{k_i\}$, and the conclusion follows. □
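
The covering argument in this proof rests on the fact that each filter entry blocks a small square of the $(\theta, f)$-plane. A minimal Python sketch of this mechanism is given below. It assumes the usual filter acceptance rule, namely that a pair $(\theta, f)$ is acceptable when, for every filter entry $(\theta_j, f_j)$, either $\theta \le (1 - \gamma_\theta)\theta_j$ or $f \le f_j - \gamma_\theta \theta_j$; this rule, the function name `acceptable` and the value of `gamma_theta` are assumptions made for illustration.

```python
def acceptable(theta, f, filter_pairs, gamma_theta=0.1):
    """True if (theta, f) is acceptable to every (theta_j, f_j) in the filter.

    Assumed rule: theta <= (1 - gamma_theta) * theta_j
               or f     <= f_j - gamma_theta * theta_j.
    """
    return all(
        theta <= (1.0 - gamma_theta) * theta_j
        or f <= f_j - gamma_theta * theta_j
        for theta_j, f_j in filter_pairs
    )


# One filter entry with theta_j = 1.0, so (3.8) holds with eps = 1.0.
# The blocked square is then [0.9, 1.0] x [4.9, 5.0].
filter_pairs = [(1.0, 5.0)]

print(acceptable(0.95, 4.95, filter_pairs))  # point inside the blocked square
print(acceptable(0.80, 10.0, filter_pairs))  # sufficient gain in feasibility
print(acceptable(1.50, 4.00, filter_pairs))  # sufficient gain in the objective
```

Because every entry with $\theta_j \ge \epsilon$ removes a square of area $\gamma_\theta^2 \epsilon^2$ from a region of finite area, only finitely many such entries can ever be accepted.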

We next examine the size of the constraint violation before and after a
"composite iteration" where restoration did not occur.

Lemma 4 Suppose that Algorithm 2.1 is applied to problem (1.1).
Suppose also that AS1 and AS3 hold and that $n_k$ satisfies (3.5) for
$k \in \mathcal{C}$. Then there exists a constant $\kappa_{\rm ubt} > 0$ such that

$$\theta_k \le \kappa_{\rm ubt} \Delta_k^{1+\mu} \tag{3.9}$$

and

$$\theta(x_k + s_k) \le \kappa_{\rm ubt} \Delta_k^2 \tag{3.10}$$

for all $k \notin \mathcal{R}$.







Proof. Assume first that k E C with fik = 1^- Since k ^ TZ, we have 
from (3.5) and (2.13) that 

(3.11) 

which gives (3.9). On the other hand, (3.1) implies that an inequality 
of the form (3.9) holds for A; G v4. or A; G C with /i^ = 0. Now, for any 
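$k$, the $i$-th constraint function at $x_k + s_k$ can be expressed as

$$c_i(x_k + s_k) = c_i(x_k) + \langle e_i, A_k s_k \rangle + \tfrac{1}{2} \langle s_k, \nabla_{xx} c_i(\xi_k) s_k \rangle$$

for $i \in \mathcal{E} \cup \mathcal{I}$, where we have used AS1, the mean-value theorem, and
where $\xi_k$ belongs to the segment $[x_k, x_k + s_k]$. Using AS3, we may
bound the Hessian of the constraint functions and we obtain from
(2.9), the Cauchy-Schwarz inequality, and (2.8) that

$$|c_i(x_k + s_k)| \le \tfrac{1}{2} \max_{x \in X} \|\nabla_{xx} c_i(x)\| \, \|s_k\|^2 \le \kappa_1 \Delta_k^2$$

if $i \in \mathcal{E}$, or

$$-c_i(x_k + s_k) \le \tfrac{1}{2} \max_{x \in X} \|\nabla_{xx} c_i(x)\| \, \|s_k\|^2 \le \kappa_1 \Delta_k^2$$

if $i \in \mathcal{I}$, where we have defined
$\kappa_1 = \tfrac{1}{2} \max_{i \in \mathcal{E} \cup \mathcal{I}} \max_{x \in X} \|\nabla_{xx} c_i(x)\|$.
This gives the desired bound for any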

^ubt > max[Ki,«:„,/«AK^i/Kisc]. 



□

We next assess the model decrease when the trust-region radius is
sufficiently small.



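Lemma 5 Suppose that Algorithm 2.1 is applied to problem (1.1)
and that finite termination does not occur. Suppose also that AS1-AS3
and (2.16) hold, that $k \in \mathcal{C}$, that, for some $\epsilon > 0$,

$$\chi_k \ge \epsilon. \tag{3.12}$$

Suppose furthermore that

$$\Delta_k \le \min\left[ 1, \; \frac{\epsilon}{\kappa_{\rm umh}}, \; \left( \frac{2 \kappa_{\rm ubg}}{\kappa_{\rm umh} \kappa_\Delta \kappa_\mu} \right)^{\frac{1}{1+\mu}}, \; \left( \frac{\kappa_{\rm tmd} \, \epsilon}{4 \kappa_{\rm ubg} \kappa_\Delta \kappa_\mu} \right)^{\frac{1}{\mu}} \right] \stackrel{\rm def}{=} \delta_m, \tag{3.13}$$

where $\kappa_{\rm ubg} = \max_{x \in X} \|\nabla_x f(x)\|$. Then

$$m_k(x_k) - m_k(x_k + s_k) \ge \tfrac{1}{2} \kappa_{\rm tmd} \, \epsilon \, \Delta_k.$$

This last inequality also holds if $k \in \mathcal{A}$, if (3.13) holds and

$$\pi_k \ge \epsilon. \tag{3.14}$$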



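Proof. Assume first that $k \in \mathcal{C}$. We note that, by (2.16), AS2,
(3.12) and (3.13),

$$m_k(x_k^N) - m_k(x_k + s_k) \ge \kappa_{\rm tmd} \, \chi_k \min\left[ \frac{\chi_k}{\beta_k}, \Delta_k \right] \ge \kappa_{\rm tmd} \, \epsilon \, \Delta_k. \tag{3.15}$$

Now

$$m_k(x_k^N) = m_k(x_k) + \langle g_k, n_k \rangle + \tfrac{1}{2} \langle n_k, H_k n_k \rangle$$

and therefore, using the Cauchy-Schwarz inequality, AS2, (2.13) and
(3.13), we find that

$$\begin{aligned} |m_k(x_k) - m_k(x_k^N)| &\le \|n_k\| \, \|g_k\| + \tfrac{1}{2} \|H_k\| \, \|n_k\|^2 \\ &\le \kappa_{\rm ubg} \|n_k\| + \tfrac{1}{2} \kappa_{\rm umh} \|n_k\|^2 \\ &\le \tfrac{1}{2} \kappa_{\rm tmd} \, \epsilon \, \Delta_k. \end{aligned}$$

We thus conclude from this last inequality and (3.15) that the desired
conclusion holds for $k \in \mathcal{C}$. If we now assume that $k \in \mathcal{A}$ (that
is iteration $k$ uses an alternative step), then (2.18), (3.13) and the
inequality $\kappa_{\rm umh} > 1$ directly yield that

$$m_k(x_k) - m_k(x_k + s_k) \ge \kappa_{\rm tmd} \min[\epsilon, \Delta_k] \ge \tfrac{1}{2} \kappa_{\rm tmd} \, \epsilon \, \Delta_k$$

as desired. □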

We continue our analysis by showing, as the reader has grown to expect, 
that iterations have to be very successful when the trust-region radius 
is sufficiently small. 



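Lemma 6 Suppose that Algorithm 2.1 is applied to problem (1.1)
and that finite termination does not occur. Suppose also that AS1-AS3,
(2.16) and (3.12) hold, that $k \notin \mathcal{R}$, and that

$$\Delta_k \le \min\left[ \delta_m, \; \frac{(1 - \eta_2) \kappa_{\rm tmd} \, \epsilon}{2 \kappa_{\rm ubh}} \right] \stackrel{\rm def}{=} \delta_\rho, \tag{3.16}$$

where $\delta_m$ denotes the right-hand side of (3.13). Then

$$\rho_k \ge \eta_2.$$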



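Proof. Using (2.21), (3.3), Lemma 5 and (3.16), we find that

$$|\rho_k - 1| = \frac{|f(x_k + s_k) - m_k(x_k + s_k)|}{|m_k(x_k) - m_k(x_k + s_k)|} \le \frac{\kappa_{\rm ubh} \Delta_k^2}{\tfrac{1}{2} \kappa_{\rm tmd} \, \epsilon \, \Delta_k} \le 1 - \eta_2,$$

from which the conclusion immediately follows. □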
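
The mechanism behind this lemma (the model error is of order $\Delta_k^2$ while the predicted decrease is of order $\Delta_k$, so their quotient tends to zero) can be observed on a small one-dimensional example. The sketch below is purely illustrative: the test function $f(x) = x^2 + x^3$, the second-order Taylor model and the steps used are assumptions chosen here, and $\rho$ is computed as the ratio of achieved to predicted reduction.

```python
def f(x):
    """Hypothetical smooth test function."""
    return x**2 + x**3


def model(x, s):
    """Second-order Taylor model of f around x, evaluated at x + s."""
    g = 2.0 * x + 3.0 * x**2   # first derivative of f at x
    h = 2.0 + 6.0 * x          # second derivative of f at x
    return f(x) + g * s + 0.5 * h * s**2


def rho(x, s):
    """Ratio of achieved to predicted reduction for the step s."""
    achieved = f(x) - f(x + s)
    predicted = model(x, 0.0) - model(x, s)
    return achieved / predicted


x = 1.0
for s in (-0.1, -0.01, -0.001):
    # |rho - 1| shrinks roughly in proportion to |s|
    print(s, abs(rho(x, s) - 1.0))
```

For this cubic the model error is exactly $s^3$, so $|\rho - 1|$ decreases linearly with the step length and eventually exceeds any threshold $\eta_2 < 1$.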

Note that this proof could easily be extended if the definition of $\rho_k$ in
(2.21) were altered to be of the form

$$\rho_k \stackrel{\rm def}{=} \frac{f(x_k) - f(x_k + s_k) + \Theta_k}{m_k(x_k) - m_k(x_k + s_k)} \tag{3.17}$$

provided $\Theta_k$ is bounded above by a multiple of $\Delta_k^2$. We will comment in
Section 4 why such a modification might be of interest.

Now, we also show that the test (2.20) will always be satisfied when
the trust-region radius is sufficiently small.

Lemma 7 Suppose that Algorithm 2.1 is applied to problem (1.1)
and that finite termination does not occur. Suppose also that AS1-AS3,
(2.16) and (3.12) hold, that $k \notin \mathcal{R}$, that $n_k$ satisfies (3.5) if
$k \in \mathcal{C}$, and that

$$\Delta_k \le \min\left[ \delta_m, \; \left( \frac{\kappa_{\rm tmd} \, \epsilon}{2 \kappa_\theta \kappa_{\rm ubt}^{\psi}} \right)^{\frac{1}{\psi(1+\mu) - 1}} \right], \tag{3.18}$$

where $\delta_m$ denotes the right-hand side of (3.13). Then

$$m_k(x_k) - m_k(x_k + s_k) \ge \kappa_\theta \theta_k^{\psi}.$$



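Proof. This directly results from the inequalities

$$\kappa_\theta \theta_k^{\psi} \le \kappa_\theta \kappa_{\rm ubt}^{\psi} \Delta_k^{\psi(1+\mu)} \le \tfrac{1}{2} \kappa_{\rm tmd} \, \epsilon \, \Delta_k \le m_k(x_k) - m_k(x_k + s_k),$$

where we successively used Lemma 4, (3.18) and Lemma 5. □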

We may also guarantee a decrease in the objective function, large enough
to ensure that the trial point is acceptable with respect to the $(\theta, f)$-pair
associated with $x_k$, so long as the constraint violation is itself sufficiently
small.

Lemma 8 Suppose that Algorithm 2.1 is applied to problem (1.1)
and that finite termination does not occur. Suppose also that AS1-AS3,
(2.16), (3.12) and (3.16) hold, that $k \notin \mathcal{R}$, that $n_k$ satisfies
(3.5) if $k \in \mathcal{C}$, and that

$$\theta_k \le \kappa_{\rm ubt}^{-\frac{1}{\mu}} \left( \frac{\eta_2 \kappa_{\rm tmd} \, \epsilon}{2 \gamma_\theta} \right)^{\frac{1+\mu}{\mu}} \stackrel{\rm def}{=} \delta_\theta. \tag{3.19}$$

Then

$$f(x_k + s_k) \le f(x_k) - \gamma_\theta \theta_k.$$



Proof. Applying Lemmas 4-6 (which is possible because of (3.12),
(3.16), $k \notin \mathcal{R}$ and because $n_k$ satisfies (3.5) for $k \in \mathcal{C}$) and (3.19),
we obtain that

$$\begin{aligned} f(x_k) - f(x_k + s_k) &\ge \eta_2 [m_k(x_k) - m_k(x_k + s_k)] \\ &\ge \tfrac{1}{2} \eta_2 \kappa_{\rm tmd} \, \epsilon \, \Delta_k \\ &\ge \gamma_\theta \theta_k \end{aligned}$$

and the desired inequality follows. □

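We now establish that if the trust-region radius and the constraint
violation are both small at a non-critical iterate $x_k$, TRQP$(x_k, \Delta_k)$ must
be compatible.

Lemma 9 Suppose that Algorithm 2.1 is applied to problem (1.1)
and that finite termination does not occur. Suppose that AS1-AS3,
(2.11), and (3.12) hold, that (2.16) holds for $k \notin \mathcal{R}$, and that

$$\Delta_k \le \min\left[ \gamma_0 \delta_\rho, \; \left( \frac{1}{\kappa_\mu} \right)^{\frac{1}{\mu}}, \; \left( \frac{\gamma_0^2 (1 - \gamma_\theta) \kappa_\Delta \kappa_\mu}{\kappa_{\rm usc} \kappa_{\rm ubt}} \right)^{\frac{1}{1-\mu}} \right], \tag{3.20}$$

where $\delta_\rho$ denotes the right-hand side of (3.16). Suppose furthermore that

$$\theta_k \le \min[\delta_\theta, \delta_n]. \tag{3.21}$$

Then $k \notin \mathcal{R}$.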



Proof. If an alternative step is used at iteration $k$, then $k \notin \mathcal{R}$.
Assume therefore that $k \notin \mathcal{A}$. Because $\theta_k \le \delta_n$, we know from (2.11)
and Lemma 1 that (2.11) and (3.5) hold. Moreover, since $\theta_k \le \delta_\theta$,
we have that (3.19) also holds. Assume, for the purpose of deriving a
contradiction, that $k \in \mathcal{R}$, which implies that

$$\|n_k\| > \kappa_\Delta \kappa_\mu \Delta_k^{1+\mu}, \tag{3.22}$$

where we have used (2.13) and the fact that $\kappa_\mu \Delta_k^{\mu} \le 1$
because of (3.20). In this case, the mechanism of the algorithm then
ensures that $k - 1 \notin \mathcal{R}$. Now assume that iteration $k - 1$ is unsuccessful.
Because of Lemmas 6 and 8, which hold at iteration $k - 1 \notin \mathcal{R}$
because of (3.20), the fact that $\theta_k = \theta_{k-1}$, (2.11), and (3.19), we
obtain that

$$\rho_{k-1} \ge \eta_2 \quad \text{and} \quad f(x_{k-1} + s_{k-1}) \le f(x_{k-1}) - \gamma_\theta \theta_{k-1}.$$







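Hence, given that $x_{k-1}$ is acceptable for the filter at the beginning of
iteration $k - 1$, if this iteration is unsuccessful, it must be because

$$\theta(x_{k-1} + s_{k-1}) > (1 - \gamma_\theta) \theta_{k-1} = (1 - \gamma_\theta) \theta_k.$$

But Lemma 4 and the mechanism of the algorithm then imply that

$$(1 - \gamma_\theta) \theta_k \le \kappa_{\rm ubt} \Delta_{k-1}^2 \le \frac{\kappa_{\rm ubt}}{\gamma_0^2} \Delta_k^2.$$

Combining this last bound with (3.22) and (2.11), we deduce that

$$\kappa_\Delta \kappa_\mu \Delta_k^{1+\mu} < \|n_k\| \le \kappa_{\rm usc} \theta_k \le \frac{\kappa_{\rm usc} \kappa_{\rm ubt}}{\gamma_0^2 (1 - \gamma_\theta)} \Delta_k^2$$

and hence that

$$\Delta_k^{1-\mu} > \frac{\gamma_0^2 (1 - \gamma_\theta) \kappa_\Delta \kappa_\mu}{\kappa_{\rm usc} \kappa_{\rm ubt}}.$$

Since this last inequality contradicts (3.20), our assumption that
iteration $k - 1$ is unsuccessful must be false. Thus iteration $k - 1$ is
successful and $\theta_k = \theta(x_{k-1} + s_{k-1})$. We then obtain from (3.22),
(2.11) and (3.10) that

$$\kappa_\Delta \kappa_\mu \Delta_k^{1+\mu} < \|n_k\| \le \kappa_{\rm usc} \theta_k \le \kappa_{\rm usc} \kappa_{\rm ubt} \Delta_{k-1}^2 \le \frac{\kappa_{\rm usc} \kappa_{\rm ubt}}{\gamma_0^2} \Delta_k^2,$$

which is again impossible because of (3.20) and because $(1 - \gamma_\theta) < 1$.
Hence our initial assumption (3.22) must be false, which yields the
desired conclusion. □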

We now distinguish two mutually exclusive cases. For the first, we con- 
sider what happens if there is an infinite subsequence of iterates belong- 
ing to the filter. 



Lemma 10 Suppose that Algorithm 2.1 is applied to problem (1.1)
and that finite termination does not occur. Suppose also that AS1-AS3
and (2.11) hold and that (2.16) holds for $k \notin \mathcal{R}$. Suppose
furthermore that $|\mathcal{Z}| = \infty$. Then there exists an infinite subsequence
$\{k_j\} \subseteq \mathcal{Z}$ such that

$$\lim_{j \to \infty} \theta_{k_j} = 0 \tag{3.23}$$

and

$$\lim_{\substack{j \to \infty \\ k_j \in \mathcal{C}}} \chi_{k_j} = 0 \quad \text{and} \quad \lim_{\substack{j \to \infty \\ k_j \in \mathcal{A}}} \pi_{k_j} = 0. \tag{3.24}$$

Proof. Let $\{k_i\}$ be any infinite subsequence of $\mathcal{Z}$. We observe that
(3.23) follows from Lemma 3. Suppose now that, for some $\epsilon_2 > 0$,

$$\chi_{k_i} \ge \epsilon_2 \tag{3.25}$$

for all $i$ such that $k_i \in \mathcal{C}$ and

$$\pi_{k_i} \ge \epsilon_2 \tag{3.26}$$

for all $i$ such that $k_i \in \mathcal{A}$. Suppose furthermore that there exists
$\epsilon_3 > 0$ such that, for all $i \ge i_0$,

$$\Delta_{k_i} \ge \epsilon_3. \tag{3.27}$$

If $k_i \notin \mathcal{A}$, (3.23) and (2.11) ensure that $n_{k_i}$ exists for $i \ge i_1 \ge i_0$, say, and
also that

$$\lim_{i \to \infty} \|n_{k_i}\| = 0. \tag{3.28}$$

Thus (3.27) ensures that (2.13) holds for sufficiently large $i$ and $k_i \notin \mathcal{R}$.
We may then decompose the model decrease in its normal and
tangential components, that is

$$m_{k_i}(x_{k_i}) - m_{k_i}(x_{k_i} + s_{k_i}) = m_{k_i}(x_{k_i}) - m_{k_i}(x^N_{k_i}) + m_{k_i}(x^N_{k_i}) - m_{k_i}(x_{k_i} + s_{k_i}). \tag{3.29}$$

Consider the normal component first. As we noted in the proof of
Lemma 5,

$$|m_{k_i}(x_{k_i}) - m_{k_i}(x^N_{k_i})| \le \kappa_{\rm ubg} \|n_{k_i}\| + \tfrac{1}{2} \kappa_{\rm umh} \|n_{k_i}\|^2,$$

which in turn, with (3.28), yields that

$$\lim_{i \to \infty} [m_{k_i}(x_{k_i}) - m_{k_i}(x^N_{k_i})] = 0. \tag{3.30}$$

If we now consider the tangential component, (3.25), (3.27), (2.16) and
AS2 yield that

$$m_{k_i}(x^N_{k_i}) - m_{k_i}(x_{k_i} + s_{k_i}) \ge \kappa_{\rm tmd} \, \epsilon_2 \min\left[ \frac{\epsilon_2}{\kappa_{\rm umh}}, \epsilon_3 \right] \stackrel{\rm def}{=} \delta_1 > 0. \tag{3.31}$$

Substituting (3.30) and (3.31) into (3.29), we find that, for $k_i \in \mathcal{C}$,

$$\liminf_{i \to \infty} [m_{k_i}(x_{k_i}) - m_{k_i}(x_{k_i} + s_{k_i})] \ge \delta_1 > 0.$$

If, on the other hand, $k_i \in \mathcal{A}$, then (3.26), (3.27) and (2.18) give that

$$m_{k_i}(x_{k_i}) - m_{k_i}(x_{k_i} + s_{k_i}) \ge \kappa_{\rm tmd} \min[\epsilon_2, \epsilon_3] \stackrel{\rm def}{=} \delta_2 > 0.$$

In both cases, we thus obtain that

$$\liminf_{i \to \infty} [m_{k_i}(x_{k_i}) - m_{k_i}(x_{k_i} + s_{k_i})] \ge \min[\delta_1, \delta_2] \stackrel{\rm def}{=} \delta > 0. \tag{3.32}$$

We now observe that, because $x_{k_i}$ is added to the filter at iteration $k_i$,
the mechanism of the algorithm imposes that either iteration $k_i \in \mathcal{R}$
or (2.20) must fail. Since we already verified that $k_i \notin \mathcal{R}$ for $i$
sufficiently large, we obtain that (2.20) must fail for such $i$, that is

$$m_{k_i}(x_{k_i}) - m_{k_i}(x_{k_i} + s_{k_i}) < \kappa_\theta \theta_{k_i}^{\psi}. \tag{3.33}$$

Combining this bound with (3.32), we find that $\theta_{k_i}$ is bounded away
from zero for $i$ sufficiently large, which is impossible in view of (3.23).
We therefore deduce that (3.27) cannot hold and obtain that there is
a subsequence $\{k_\ell\} \subseteq \{k_i\}$ for which

$$\lim_{\ell \to \infty} \Delta_{k_\ell} = 0.$$

We now restrict our attention to the tail of this subsequence, that is to
the set of indices $k_\ell$ that are large enough to ensure that (3.18), (3.19)
and (3.20) hold, which is possible by definition of the subsequence and
because of (3.23). For these indices, we may therefore apply Lemma 9,
and deduce that iteration $k_\ell \notin \mathcal{R}$ for $\ell$ sufficiently large. Hence, as
above, (3.33) must hold for $\ell$ sufficiently large. However, we may also
apply Lemma 7, which contradicts (3.33), and therefore (3.25) and
(3.26) cannot hold together, yielding the desired result. □

Thus, if an infinite subsequence of iterates is added to the filter, Lemma 2
ensures that it converges to a first-order critical point. Our remaining
analysis then naturally concentrates on the possibility that there may be
no such infinite subsequence. In this case, no further iterates are added
to the filter for $k$ sufficiently large. In particular, this means that the
number of restoration iterations, $|\mathcal{R}|$, must be finite. In what follows,
we assume that $k_0 \ge 0$ is the last iteration for which $x_{k_0 - 1}$ is added to
the filter.

Lemma 11 Suppose that Algorithm 2.1 is applied to problem (1.1)
and that finite termination does not occur. Suppose also that AS1-AS3
and (2.11) hold and that (2.16) holds for $k \notin \mathcal{R}$. Then we have
that

$$\lim_{k \to \infty} \theta_k = 0. \tag{3.34}$$

Furthermore, $n_k$ exists and satisfies (3.5) for all $k \ge k_0$ sufficiently
large.



Proof. Consider any successful iterate with $k \ge k_0$. Then we have
that

$$f(x_k) - f(x_{k+1}) \ge \eta_1 [m_k(x_k) - m_k(x_k + s_k)] \ge \eta_1 \kappa_\theta \theta_k^{\psi} \ge 0. \tag{3.35}$$

Thus the objective function does not increase for all successful
iterations with $k \ge k_0$. But AS1 and AS3 imply (3.4) and therefore we
must have, from the first part of this statement, that

$$\lim_{\substack{k \in \mathcal{S} \\ k \to \infty}} f(x_k) - f(x_{k+1}) = 0. \tag{3.36}$$

(3.34) then immediately follows from (3.35) and the fact that $\theta_j = \theta_k$
for all unsuccessful iterations $j$ that immediately follow the successful
iteration $k$, if any. The last conclusion then results from (2.11) and
Lemma 1. □

We now show that the trust-region radius cannot become arbitrarily 
small if the (asymptotically feasible) iterates stay away from first-order 
critical points. 



Lemma 12 Suppose that Algorithm 2.1 is applied to problem (1.1)
and that finite termination does not occur. Suppose also that AS1-AS3
hold and that (2.16) holds for $k \notin \mathcal{R}$. Suppose furthermore that
(3.12) holds for all $k \ge k_0$. Then there exists a $\Delta_{\min} > 0$ such that

$$\Delta_k \ge \Delta_{\min}$$

for all $k$.



Proof. Suppose that $k_1 \ge k_0$ is chosen sufficiently large to ensure
that (3.21) holds and thus that (2.11) also holds for all $k \ge k_1$, which
is possible because of Lemma 11. Suppose also, for the purpose of
obtaining a contradiction, that iteration $j$ is the first iteration following
iteration $k_1$ for which

$$\Delta_j \le \gamma_0 \min\left[ \delta_\rho, \; \sqrt{\frac{(1 - \gamma_\theta) \theta^F}{\kappa_{\rm ubt}}}, \; \Delta_{k_1} \right] \stackrel{\rm def}{=} \gamma_0 \delta_s, \tag{3.37}$$

where $\delta_\rho$ denotes the right-hand side of (3.16) and where

$$\theta^F = \min_{i \in \mathcal{Z}} \theta_i$$

is the smallest constraint violation appearing in the filter. Note also
that the inequality $\Delta_j \le \gamma_0 \Delta_{k_1}$, which is implied by (3.37), ensures
that $j \ge k_1 + 1$ and hence that $j - 1 \ge k_1$ and thus that $j - 1 \notin \mathcal{R}$.
Then the mechanism of the algorithm and (3.37) imply that

$$\Delta_{j-1} \le \frac{1}{\gamma_0} \Delta_j \le \delta_s \tag{3.38}$$

and Lemma 6, which is applicable because (3.37) and (3.38) together
imply (3.16) with $k$ replaced by $j - 1$, then ensures that

$$\rho_{j-1} \ge \eta_2. \tag{3.39}$$

Furthermore, since $n_{j-1}$ satisfies (2.11), Lemma 1 implies that
we can apply Lemma 4. This, together with (3.37) and (3.38), gives
that

$$\theta(x_{j-1} + s_{j-1}) \le \kappa_{\rm ubt} \Delta_{j-1}^2 \le (1 - \gamma_\theta) \theta^F. \tag{3.40}$$

We may also apply Lemma 8 because (3.37) and (3.38) ensure that
(3.16) holds and because (3.19) also holds for $j - 1 \ge k_1$. Hence we
deduce that

$$f(x_{j-1} + s_{j-1}) \le f(x_{j-1}) - \gamma_\theta \theta_{j-1}.$$

This last relation and (3.40) ensure that $x_{j-1} + s_{j-1}$ is acceptable
for the filter and $x_{j-1}$. Combining this conclusion with (3.39) and
the mechanism of the algorithm, we obtain that $\Delta_j \ge \Delta_{j-1}$. As a
consequence, and since (2.20) also holds at iteration $j - 1$, iteration $j$
cannot be the first iteration following $k_1$ for which (3.37) holds. This
contradiction shows that $\Delta_k \ge \gamma_0 \delta_s$ for all $k > k_1$, and the desired
result follows if we define

$$\Delta_{\min} = \min[\Delta_0, \ldots, \Delta_{k_1}, \gamma_0 \delta_s].$$

□

We may now analyze the convergence of $\chi_k$ itself.



Lemma 13 Suppose that Algorithm 2.1 is applied to problem (1.1)
and that finite termination does not occur. Suppose also that AS1-AS3
and (2.11) hold, and that (2.16) holds for $k \notin \mathcal{R}$. Then there exists
a subsequence $\{k_j\}$ such that

$$\liminf_{\substack{j \to \infty \\ k_j \in \mathcal{C}}} \chi_{k_j} = 0 \quad \text{and} \quad \liminf_{\substack{j \to \infty \\ k_j \in \mathcal{A}}} \pi_{k_j} = 0. \tag{3.41}$$



Proof. We start by observing that Lemma 11 implies that the second
conclusion of (2.11) holds for $k$ sufficiently large. Moreover, as in
Lemma 11, we obtain (3.35) and therefore (3.36) for each $k \in \mathcal{S}$,
$k \ge k_0$. Suppose now, for the purpose of obtaining a contradiction,
that (3.12) and (3.14) hold. Assume first that $k \in \mathcal{C}$. In this case,
notice that

$$m_k(x_k) - m_k(x_k + s_k) = m_k(x_k) - m_k(x_k^N) + m_k(x_k^N) - m_k(x_k + s_k). \tag{3.42}$$

Moreover, note, as in Lemma 5, that

$$|m_k(x_k) - m_k(x_k^N)| \le \kappa_{\rm ubg} \|n_k\| + \kappa_{\rm umh} \|n_k\|^2,$$

which in turn yields that

$$\lim_{k \to \infty} [m_k(x_k) - m_k(x_k^N)] = 0$$

because of Lemma 11 and the second conclusion of (2.11). This limit,
together with (3.35), (3.36) and (3.42), then gives that

$$\lim_{\substack{k \to \infty \\ k \in \mathcal{S}}} [m_k(x_k^N) - m_k(x_k + s_k)] = 0. \tag{3.43}$$

But (2.16), (3.12), AS2 and Lemma 12 together imply that, for all
$k \ge k_0$,

$$m_k(x_k^N) - m_k(x_k + s_k) \ge \kappa_{\rm tmd} \, \chi_k \min\left[ \frac{\chi_k}{\beta_k}, \Delta_k \right] \ge \kappa_{\rm tmd} \, \epsilon \min\left[ \frac{\epsilon}{\kappa_{\rm umh}}, \Delta_{\min} \right], \tag{3.44}$$

immediately giving a contradiction with (3.43).

On the other hand, if $k \in \mathcal{A}$, then (3.14) and (2.18) immediately
imply that

$$m_k(x_k) - m_k(x_k + s_k) \ge \kappa_{\rm tmd} \min[\epsilon, \Delta_{\min}] > 0,$$

which, together with (2.21) and the fact that $k \in \mathcal{S}$, contradicts the
boundedness of $f$. Hence (3.12) and (3.14) cannot hold together and
the desired result follows. □

We may summarize all of the above in our main global convergence 
result. 



Theorem 14 Suppose that Algorithm 2.1 is applied to problem (1.1)
and that finite termination does not occur. Suppose also
that AS1-AS3 and (2.11) hold, and that (2.16) holds for $k \notin \mathcal{R}$. Let
$\{x_k\}$ be the sequence of iterates produced by the algorithm. Then
either the restoration procedure terminates unsuccessfully by
converging to an infeasible first-order critical point of problem (2.17),
or there is a subsequence $\{k_j\}$ for which

$$\lim_{j \to \infty} x_{k_j} = x_*$$

and $x_*$ is a first-order critical point for problem (1.1).



Proof. Suppose that the restoration iteration always terminates
successfully. From AS3, Lemmas 10, 11 and 13, we obtain that, for
some subsequence $\{k_j\}$,

$$\lim_{j \to \infty} \theta_{k_j} = \lim_{\substack{j \to \infty \\ k_j \in \mathcal{C}}} \chi_{k_j} = \lim_{\substack{j \to \infty \\ k_j \in \mathcal{A}}} \pi_{k_j} = 0. \tag{3.45}$$

The conclusion then follows from Lemma 2. □

Can we dispense with AS3 to obtain this result? Firstly, this assumption
ensures that the objective and constraint functions remain bounded
above and below (see (3.4)). This is crucial for the rest of the analysis
because the convergence of the iterates to feasibility depends on the
fact that the area of the filter is finite. Thus, if AS3 does not hold, we
have to verify that (3.4) holds for other reasons. The second part of
this statement may be ensured quite simply by initializing the filter to
$(\theta^{\max}, -\infty)$, for some $\theta^{\max} > \theta_0$, in Step 0 of the algorithm. This has the
effect of putting an upper bound on the infeasibility of all iterates, which
may be useful in practice. However, this does not prevent the objective
function from being unbounded below in

$$\mathcal{C}(\theta^{\max}) = \{x \in \mathbb{R}^n \mid \theta(x) \le \theta^{\max}\}$$

and we cannot exclude the possibility that a sequence of infeasible
iterates might both continue to improve the value of the objective function
and satisfy (2.20). If $\mathcal{C}(\theta^{\max})$ is bounded, AS3 is most certainly satisfied.
If this is not the case, we could assume that

$$f^{\min} \le f(x) \le f^{\max} \quad \text{and} \quad 0 \le \theta(x) \le \theta^{\max} \quad \text{for } x \in \mathcal{C}(\theta^{\max}) \tag{3.46}$$

for some values of $f^{\min}$ and $f^{\max}$, and simply monitor that the values $f(x_k)$
are reasonable (in view of the problem being solved) as the algorithm
proceeds. To summarize, we may replace AS1 and AS3 by the following
assumption.

AS4: The functions $f$ and $c$ are twice continuously differentiable on an
open set containing $\mathcal{C}(\theta^{\max})$, their first and second derivatives are
uniformly bounded on $\mathcal{C}(\theta^{\max})$, and (3.46) holds.

The reader should note that AS4 no longer ensures the existence of
a limit point, but only that (3.45) holds for some subsequence $\{k_j\}$.
Furthermore, the comments following the statement of (2.11) no longer
apply if limit points at infinity are allowed.

4. Conclusion and Perspectives 

We have introduced a hybrid trust-region SQP-filter algorithm for 
general nonlinear programming, that mixes composite steps with poten- 
tially cheaper alternative steps, and we have shown this algorithm to be 
globally convergent to first-order critical points. This hybrid algorithm 
has the potential of being numerically more efficient than its version that 
only uses composite steps, as analyzed in Fletcher et al. (1999). However,
the authors are well aware that this potential must be confirmed
by numerical experiments.

Abstract This paper is an extended abstract of a survey talk given at the IFIP
TC7 Conference in Trier, July 2001. We consider linear and nonlinear
semidefinite programming problems and concentrate on selected aspects
that are relevant for understanding dual barrier methods. The paper is
aimed at graduate students to highlight some issues regarding smoothness,
regularity, and computational complexity without going into details.

Keywords: Semidefinite programming, smoothness, regularity, interior method, lo- 
cal minimization. 

1. Introduction 

In this paper we consider nonlinear semidefinite programming problems
(NLSDP's) and concentrate on some aspects relevant to a dual
barrier method. Other approaches for solving NLSDP's are the program
package LOQO of Vanderbei (1997) based on a primal-dual approach,
or recent work of Vanderbei et al. (2000). Also the work of Kocvara
and Stingl (2001) solving large scale semidefinite programs based on a
modified barrier approach seems very promising. The modified barrier
approach does not require the barrier parameter to converge to zero and
may thus overcome some of the problems related to ill-conditioning in
traditional interior methods. Optimality conditions for NLSDP's are
considered in Forsgren (2000); Shapiro and Scheinberg (2000).

Some problems considered in this paper do not satisfy any constraint
qualification. For such problems primal-dual methods do not appear to
be suitable. Another question addressed in this paper is the question
of how to avoid "poor" local minimizers, a question that may be even
more difficult to investigate for primal-dual methods than it is for barrier
methods.

1.1 Notation 

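The following notation has become standard in the literature on linear
semidefinite programs. The space of symmetric $n \times n$ matrices is denoted
by $\mathcal{S}^n$. The inequality

$$X \succeq 0, \quad (X \succ 0)$$

is used to indicate that $X$ is a symmetric positive semidefinite (positive
definite) $n \times n$ matrix. By

$$\langle C, X \rangle = C \bullet X = \operatorname{trace}(C^T X) = \sum_{i,j} C_{ij} X_{ij}$$

we denote the standard scalar product on the space of $n \times n$ matrices
inducing the Frobenius norm,

$$X \bullet X = \|X\|_F^2.$$

For given symmetric matrices $A^{(i)}$ we define a linear map $\mathcal{A}$ from $\mathcal{S}^n$ to
$\mathbb{R}^m$ by

$$\mathcal{A}(X) = \begin{pmatrix} A^{(1)} \bullet X \\ \vdots \\ A^{(m)} \bullet X \end{pmatrix}.$$

The adjoint operator $\mathcal{A}^*$ satisfying

$$\langle \mathcal{A}^*(y), X \rangle = y^T \mathcal{A}(X) \quad \forall X \in \mathcal{S}^n, \; y \in \mathbb{R}^m$$

is given by

$$\mathcal{A}^*(y) = \sum_{i=1}^{m} y_i A^{(i)}.$$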

2. Linear semidefinite programs 

In this section we consider a pair of primal and dual (linear)
semidefinite programs in standard form,

(P) minimize $C \bullet X$ s.t. $\mathcal{A}(X) = b$, $X \succeq 0$

and

(D) maximize $b^T y$ s.t. $\mathcal{A}^*(y) + S = C$, $S \succeq 0$.

When using the notion "semidefinite program" (SDP) we always refer
to a linear semidefinite program; nonlinear SDP's will be denoted by
NLSDP. The data for (P) and (D) are a linear map $\mathcal{A}$, a vector $b \in \mathbb{R}^m$
and a matrix $C \in \mathcal{S}^n$. We use the convention that the infimum of (P)
is $+\infty$ whenever (P) does not have any feasible solution $X$, and the
supremum of (D) is $-\infty$ if the feasible set of (D) is empty. If there
exists a matrix $X \succ 0$ (not just $X \succeq 0$) that is feasible for (P) then
we call $X$ "strictly" feasible for (P) and say that (P) satisfies Slater's
condition. Likewise, if there exists a matrix $S \succ 0$ that is feasible for
(D) we call (D) strictly feasible. If Slater's condition is satisfied by (P)
or by (D) then the optimal values of (P) and (D) coincide, and if both
problems satisfy Slater's condition, then the optimal solutions $X^{\rm opt}$ and
$y^{\rm opt}$, $S^{\rm opt}$ of both problems exist and satisfy the equation

$$X^{\rm opt} S^{\rm opt} = 0. \tag{1}$$

Conversely, any pair $X$ and $y$, $S$ of feasible points for (P) and (D)
satisfying (1) is optimal for both problems, see e.g. Shapiro and Scheinberg
(2000). Condition (1) implies that there exists a unitary matrix $U$ that
simultaneously diagonalizes $X^{\rm opt}$ and $S^{\rm opt}$. Moreover, the eigenvalues of
$X^{\rm opt}$ and $S^{\rm opt}$ to the same eigenvector are complementary.
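
The complementarity condition (1) can be checked by hand on a tiny instance. The sketch below is an illustration under assumptions chosen here: a $2 \times 2$ problem with the single constraint $\operatorname{trace}(X) = 1$ (so $A^{(1)} = I$ and $b = 1$), the cost matrix $C = \operatorname{diag}(1, 3)$, and hand-picked feasible points; the helper names `matmul` and `inner` are hypothetical.

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inner(A, B):
    """Trace inner product A . B = sum_ij A_ij * B_ij."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))


C = [[1.0, 0.0], [0.0, 3.0]]
A1 = [[1.0, 0.0], [0.0, 1.0]]   # single constraint matrix, b = 1

X = [[1.0, 0.0], [0.0, 0.0]]    # primal feasible: trace(X) = 1, X psd
y = 1.0                         # dual variable
# Dual slack S = C - y * A1, here diag(0, 2), which is psd.
S = [[C[i][j] - y * A1[i][j] for j in range(2)] for i in range(2)]

print(matmul(X, S))             # complementarity: the product vanishes
print(inner(C, X), 1.0 * y)     # primal and dual objective values agree
```

Since both points are feasible and satisfy (1), the converse statement above shows that they are optimal; the eigenvalue pairs $(1, 0)$ and $(0, 2)$ along the two coordinate directions are complementary.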

The two main applications of semidefinite programs are relaxations for
combinatorial optimization problems, see e.g. Alizadeh (1991); Helmberg
et al. (1996); Goemans and Williamson (1994), and semidefinite programs
arising from Lyapunov functions or from the positive real lemma
in control theory, see e.g. Boyd et al. (1994); Leibfritz (2001); Scherer
(1999). Next, we give two simple examples for such applications.

2.1 A first simple example 

In our first example we consider the differential equation

$$\dot{x}(t) = A x(t)$$

for some vector function $x : \mathbb{R} \to \mathbb{R}^n$. By definition, this system is called
stable if for all initial values $x^{(0)} = x(0)$ the solutions $x(t)$ converge to
zero when $t \to \infty$. It is well known, see e.g. Hirsch and Smale (1974),
that this is the case if and only if the real part of all eigenvalues of $A$ is
negative,

$$\operatorname{Re}(\lambda_i(A)) < 0 \quad \text{for } 1 \le i \le n.$$

By Lyapunov's theorem, this is the case if and only if

$$\exists P \succ 0: \quad -A^T P - P A \succ 0.$$

Let us now assume that the system matrix $A$ is subject to uncertainties
that can be "confined" to a polyhedron with $m$ given vertices $A^{(i)}$, i.e.

$$A = A(t) \in \operatorname{conv}\{A^{(1)}, \ldots, A^{(m)}\} \quad \text{for } t \ge 0.$$

In this case the existence of a Lyapunov matrix $P \succ 0$ with

$$-(A^{(i)})^T P - P A^{(i)} \succ 0 \quad \text{for } 1 \le i \le m \tag{2}$$

implies that

$$-A(t)^T P - P A(t) \succ 0 \quad \text{for all } t \ge 0,$$

and hence,

$$\begin{aligned} 0 &> x(t)^T \left( A(t)^T P + P A(t) \right) x(t) = (A(t) x(t))^T P x(t) + x(t)^T P (A(t) x(t)) \\ &= \frac{d}{dt} \left( x(t)^T P x(t) \right) = \frac{d}{dt} \|x(t)\|_P^2 \end{aligned}$$

whenever $x(t) \ne 0$. This implies that $\|x(t)\|_P \to 0$, and hence the
existence of a matrix $P \succ 0$ satisfying (2) is a sufficient condition to
prove stability of the uncertain system. (The above argument only shows
that $\|x(t)\|_P$ is monotonically decreasing. In order to show that $\|x(t)\|_P$
converges to zero, one can find a strictly negative bound for $\frac{d}{dt} \|x(t)\|_P^2$
using the largest real part of the eigenvalues of $(A^{(i)})^T P + P A^{(i)}$.)
There are straightforward ways to formulate the problem of finding
a matrix $P \succ 0$ satisfying (2) as a linear semidefinite program, see
e.g. Boyd et al. (1994). While this simple example results in a linear
semidefinite program, other problems from controller design often result
in bilinear semidefinite programs that are no longer convex, see e.g.
Leibfritz (2001); Scherer (1999); Freund and Jarre (2000).
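
The key step in this example, namely that the Lyapunov inequality at the vertices carries over to every matrix in their convex hull, can be verified numerically on a small instance. The following sketch is illustrative only: the two vertex matrices, the candidate matrix $P$ and all helper names are assumptions chosen here, and positive definiteness of a symmetric $2 \times 2$ matrix is tested through its leading principal minors.

```python
def transpose(A):
    return [list(row) for row in zip(*A)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def lyapunov_matrix(A, P):
    """Return -A^T P - P A for 2x2 matrices A and P."""
    AtP = matmul(transpose(A), P)
    PA = matmul(P, A)
    return [[-(AtP[i][j] + PA[i][j]) for j in range(2)] for i in range(2)]


def is_pos_def_2x2(M):
    """Leading-principal-minor test for a symmetric 2x2 matrix."""
    return M[0][0] > 0 and M[0][0] * M[1][1] - M[0][1] * M[1][0] > 0


# Hypothetical vertices of the uncertainty set and a candidate P.
A1 = [[-1.0, 1.0], [0.0, -1.0]]
A2 = [[-2.0, 0.0], [1.0, -1.0]]
P = [[2.0, 0.0], [0.0, 1.0]]


def combo(lam):
    """Convex combination lam * A1 + (1 - lam) * A2."""
    return [[lam * A1[i][j] + (1.0 - lam) * A2[i][j] for j in range(2)]
            for i in range(2)]


vertices_ok = all(is_pos_def_2x2(lyapunov_matrix(A, P)) for A in (A1, A2))
combos_ok = all(is_pos_def_2x2(lyapunov_matrix(combo(k / 10.0), P))
                for k in range(11))
print(vertices_ok, combos_ok)
```

The second check succeeds whenever the first does, because $-A^T P - P A$ depends linearly on $A$ and a convex combination of positive definite matrices is positive definite.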

2.2 A second simple example 

Binary quadratic programs (also known as max-cut problems) have
few applications in VLSI layout or in spin glass models from physics.
Their most important property, however, appears to be the fact that
these problems are NP-complete (and hence, there is no known
polynomial time method for solving these problems). What makes these
problems so appealing is that they appear to be quite easy.

Let

$$\mathcal{MC} = \operatorname{conv}\left\{ x x^T \mid x_i \in \{\pm 1\} \text{ for } 1 \le i \le n \right\}$$

be the max-cut polytope. Hence, $\mathcal{MC}$ is the convex hull of all rank-1
matrices generated by $\pm 1$-vectors $x$. Any binary quadratic program or
max-cut problem can be written in the form

minimize $C \bullet X$ s.t. $X \in \mathcal{MC}$. (3)

This is a standard linear program with the drawback that the feasible set
$\mathcal{MC}$ is defined as the convex hull of exponentially many points $x x^T$, rather
than being defined by (a polynomial number of) linear inequalities.

Let $e = (1, \ldots, 1)^T$ be the vector of all ones. It is straightforward to
see that $\mathcal{MC}$ can be written in the form

$$\mathcal{MC} = \operatorname{conv}\left( \{X \succeq 0 \mid \operatorname{diag}(X) = e, \; \operatorname{rank}(X) = 1\} \right).$$

Due to the condition $\operatorname{diag}(X) = e$ the set $\mathcal{MC}$ lies in an affine subspace
of $\mathcal{S}^n$ of dimension $n(n-1)/2$. $\mathcal{MC}$ has $2^{n-1}$ vertices that are pairwise
adjacent, i.e. connected by an edge (a 1-dimensional extreme set of $\mathcal{MC}$).
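
For small $n$ the vertices of the max-cut polytope can simply be enumerated. The sketch below, with a hypothetical cost matrix `C` and hypothetical helper names, lists the matrices $x x^T$ for all $\pm 1$-vectors with $n = 3$, confirms that there are $2^{n-1}$ distinct ones with unit diagonal, and compares the linear objective over these vertices with a brute-force evaluation of $x^T C x$.

```python
from itertools import product


def outer(x):
    """Rank-1 matrix x x^T as a nested tuple (hashable)."""
    return tuple(tuple(xi * xj for xj in x) for xi in x)


def inner(C, X):
    """Trace inner product C . X = sum_ij C_ij * X_ij."""
    n = len(C)
    return sum(C[i][j] * X[i][j] for i in range(n) for j in range(n))


n = 3
# x and -x generate the same matrix, so only 2^(n-1) vertices remain.
vertices = {outer(x) for x in product((-1, 1), repeat=n)}
print(len(vertices))

# Hypothetical symmetric cost matrix.
C = [[0, 1, 2], [1, 0, -3], [2, -3, 0]]

lp_value = min(inner(C, X) for X in vertices)
brute_force = min(
    sum(C[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
    for x in product((-1, 1), repeat=n)
)
print(lp_value, brute_force)
```

The enumeration grows exponentially with $n$, which is exactly the drawback of formulation (3) noted above.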

Note that the constraints of this second definition of $\mathcal{MC}$ appear to
be smooth constraints: a semidefiniteness constraint, a linear constraint,
and a rank condition. These conditions, however, imply that there are
only finitely many “discrete” elements of which the convex hull is taken.
In some sense the constraints contain a hidden binary constraint allowing
only certain matrices with entries ±1. When the rank constraint is
omitted, we obtain the standard SDP relaxation of the max-cut problem,

$$\mathcal{SDP} = \{X \succeq 0 \mid \operatorname{diag}(X) = e\}$$

satisfying $\mathcal{MC} \subset \mathcal{SDP}$. A relaxed version of (3) is thus given by

$$\text{minimize } C \bullet X \quad \text{s.t. } X \in \mathcal{SDP}. \qquad (4)$$

This problem is a linear SDP of the form (P) and can be solved
efficiently using, for example, interior point methods, see e.g. Helmberg
et al. (1996). Goemans and Williamson (1994) have shown how to
obtain an excellent approximation of the max-cut problem (3) using the
solution $X$ of (4).

A quite interesting inner approximation of MC leading to a nonlinear 
semidefinite program is described in Chapter 3.3. 

2.3 Smoothness of semidefiniteness constraint 

To understand the complexity of nonlinear semidefinite programs we 
briefly address the question of smoothness and regularity of the
semidefinite cone. The set of positive semidefinite matrices can be characterized
in several different forms:

$$\begin{aligned}
\{X \mid X \succeq 0\} &= \{X \mid \lambda_{\min}(X) \ge 0\} \\
&= \{X \mid \lambda_i(X) \ge 0 \text{ for } 1 \le i \le n\} \\
&= \{X \mid u^T X u \ge 0 \text{ for all } u \in \mathbb{R}^n \ (\|u\| = 1)\} \\
&= \{X \mid \det(X_{\Sigma,\Sigma}) \ge 0 \text{ for all } \Sigma \subset \{1, \dots, n\}\} \\
&= \{X \mid \exists Z : X = Z^2\}.
\end{aligned}$$

The first characterization uses the smallest eigenvalue $\lambda_{\min}(X)$ of $X$.
This is a nonsmooth representation. When ordering the eigenvalues in
a suitable way, the eigenvalues $\lambda_i(X)$ used in the second
representation have directional derivatives, but are not totally differentiable. The
third representation is based on a semi-infinite constraint. From this
representation one can easily deduce, for example, that $\{X \mid X \succeq 0\}$ is
convex. The fourth representation is based on a finite (but exponential)
number of smooth constraints, requiring all principal subdeterminants
to be nonnegative. This representation certainly justifies the claim that
$\{X \mid X \succeq 0\}$ is bounded by smooth constraints. As shown in Pataki
(2000), the tangent plane to $\{X \mid X \succeq 0\}$ at a point $X$ is given as
follows. Let

$$X = UDU^T$$

with a diagonal matrix $D$ and a unitary matrix $U$. If $X$ is a boundary
point of $\{X \mid X \succeq 0\}$ we may assume without loss of generality that the
first $k$ diagonal entries of $D$ satisfy $d_1 = \dots = d_k = 0$ and
$d_{k+1}, \dots, d_n > 0$. Let $\Delta X$ be given by

$$\Delta X = U \begin{pmatrix} 0 & * \\ * & * \end{pmatrix} U^T$$

where the 0-block in the matrix on the right hand side is of size $k \times k$
and the entries $*$ are any entries of suitable dimension. All matrices $\Delta X$
of the above form belong to the tangent space of $\{X \mid X \succeq 0\}$ at $X$.
The fourth representation also leads to the convex barrier function

$$\Phi(X) = -\log\det(X)$$

for the positive semidefinite cone. For this barrier function it is sufficient
to consider $\Sigma = \{1, \dots, n\}$, and to set $\Phi(X) = \infty$ whenever $X$ is not
positive definite.

The last representation is a projection of a quadratic equality con- 
straint. 

Most, if not all, of the above representations have been used numeri- 
cally to enforce semidefiniteness of some unknown matrix X. 
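
As a small illustration of the third and fourth representations, the following plain-Python sketch tests tiny symmetric matrices through all principal subdeterminants and through the quadratic form. The function names are hypothetical, and the Laplace expansion is only sensible for very small matrices. The last example shows why all principal subdeterminants are needed, not only the leading ones.

```python
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion along the first row (tiny matrices only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def is_psd_by_minors(x, tol=1e-12):
    """Fourth representation: every principal subdeterminant is nonnegative."""
    n = len(x)
    for size in range(1, n + 1):
        for idx in combinations(range(n), size):
            sub = [[x[i][j] for j in idx] for i in idx]
            if det(sub) < -tol:
                return False
    return True

def quad_form(x, u):
    """Third representation: the value u^T X u for one fixed vector u."""
    n = len(u)
    return sum(u[i] * x[i][j] * u[j] for i in range(n) for j in range(n))

tridiagonal = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]   # positive definite
indefinite = [[1, 2], [2, 1]]                         # u = (1, -1) gives u^T X u < 0
degenerate = [[0, 0], [0, -1]]                        # leading minors vanish, not PSD

assert is_psd_by_minors(tridiagonal)
assert not is_psd_by_minors(indefinite)
assert quad_form(indefinite, [1, -1]) < 0
assert not is_psd_by_minors(degenerate)
```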

The set $\{X \mid X \succeq 0\}$ certainly satisfies Slater's condition, or, in the
context of nonconvex minimization, any point $X \in \{X \mid X \succeq 0\}$
trivially satisfies the constraint qualification by Robinson. However, the
fourth representation above does not satisfy LICQ. (LICQ is a common
regularity condition requiring that the gradients of the active constraints at
any point are linearly independent, see e.g. Wright and Nocedal (1999).) In fact, (for
$n > 1$) there does not exist any representation of the positive
semidefinite cone by nonlinear inequalities that satisfies LICQ. Nevertheless
the positive semidefinite cone and its surface are numerically tractable,
and may be considered as a regular set with smooth constraints.

2.4 A dual barrier method 

We consider problem (D) and eliminate the slack variable $S$ to obtain
the problem

$$\text{maximize } b^T y \quad \text{s.t. } C - \mathcal{A}^*(y) \succeq 0.$$

For $y$ with $C - \mathcal{A}^*(y) \succ 0$ we then define a convex barrier function

$$\Phi(y) = -\log\left(\det(C - \mathcal{A}^*(y))\right).$$

A plain dual barrier method can be stated as follows:

Dual barrier method

Start: Find $y^{(0)}$ with $C - \mathcal{A}^*(y^{(0)}) \succ 0$.

For k = 1, 2, 3, ...

■ Set $\mu_k = 10^{-k}$ and find

$$y^{(k)} \approx y(\mu_k) = \operatorname{argmin}_y \; -\frac{b^T y}{\mu_k} + \Phi(y)$$

by Newton’s method with line search starting at $y^{(k-1)}$.

Of course, this conceptual method needs many refinements such as an
appropriate choice of the starting point and a somewhat more
sophisticated update of $\mu_k$. With such minor modifications, however, the above
algorithm solves the semidefinite programming problem in polynomial
time. (The notion of polynomiality in the context of nonlinear
programming is to be taken with care; the solution of a linear semidefinite
program can have an exponential “size” like an optimal value of $2^{2^n}$
for a semidefinite program with encoding length $O(n)$. By
“polynomial time” we mean that the method reduces some
primal-dual gap function in polynomial time, see e.g. Nesterov and Nemirovski
(1994).)
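
The conceptual method can be tried on a toy instance with a single unknown. The sketch below is an illustration under stated assumptions, not the implementation behind the results cited here: it assumes $C = I$, $\mathcal{A}^*(y) = y \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ and $b = 1$, so that $\det(C - \mathcal{A}^*(y)) = 1 - y^2$ and the optimal value is 1. All function names are hypothetical.

```python
import math

def barrier_objective(y, mu):
    """-b*y/mu + Phi(y) with b = 1 and Phi(y) = -log det(C - A*(y)) = -log(1 - y^2)."""
    return -y / mu - math.log(1.0 - y * y)

def newton_with_line_search(y, mu, tol=1e-18, max_iter=100):
    """Minimize barrier_objective(., mu) starting from a strictly feasible y."""
    for _ in range(max_iter):
        s = 1.0 - y * y
        grad = -1.0 / mu + 2.0 * y / s
        hess = (2.0 + 2.0 * y * y) / (s * s)
        if grad * grad / hess < tol:          # squared Newton decrement
            break
        step = -grad / hess
        f0 = barrier_objective(y, mu)
        t = 1.0
        # Backtrack until the trial point is strictly feasible and decreases enough.
        while (abs(y + t * step) >= 1.0 or
               barrier_objective(y + t * step, mu) > f0 + 1e-4 * t * grad * step):
            t *= 0.5
            if t < 1e-16:                     # no further progress in floating point
                return y
        y += t * step
    return y

def dual_barrier_method(num_outer=6):
    """Run the outer loop mu_k = 10^(-k), warm starting each Newton solve."""
    y = 0.0                                   # strictly feasible starting point
    history = []
    for k in range(1, num_outer + 1):
        mu = 10.0 ** (-k)
        y = newton_with_line_search(y, mu)
        history.append((mu, y))
    return history

for mu, y in dual_barrier_method():
    print(f"mu = {mu:.0e}   y = {y:.10f}   gap = {1.0 - y:.3e}")
```

For this instance the minimizer is known in closed form, $y(\mu) = -\mu + \sqrt{1 + \mu^2}$, so the distance $1 - y(\mu)$ to the optimal value is of order $\mu$.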

The key elements in guaranteeing the theoretical efficiency of the bar- 
rier method rest on two facts: 

■ The duality gap (or some linear measure of closeness to optimality)
is of order $\mu$.

■ The Hessian of the barrier function satisfies a local relative
Lipschitz condition.

Both facts were shown by Nesterov and Nemirovski (1994) and rest on
two conditions introduced there. The first
fact is implied by a local Lipschitz condition of $\Phi$ with respect to the
norm induced by $\nabla^2\Phi(y)$, and the second fact is called self-concordance,
and implies that Newton’s method converges globally at a fixed rate.
More details can be found in Nesterov and Nemirovski (1994); Jarre
(1996).

The guaranteed convergence results in these references are much slow- 
er than what is observed in implementations of related methods. In fact, 
these theoretical results are much too slow to be relevant for practical 
applications. However, these results guarantee a certain independence of 
the method from the data of the problem. Even with exact arithmetic, 
the performance of the steepest descent method for unconstrained mini- 
mization, for example, depends on the condition number of the Hessian 
matrix at the optimal solution. Unlike the steepest descent method, the 
worst case bound for the barrier method only depends on the dimension 
$n$ of the problem (D), but not on any condition numbers or any other
parts of the data of the problem. In this respect, the theoretical analysis 
is relevant for practical applications. 

The above barrier method is not suitable for practical
implementations. The following simple acceleration scheme is essential for obtaining
a more practical algorithm: Observe that the points $y(\mu)$ that are
approximated at each iteration of the barrier method satisfy

$$-\frac{b}{\mu} + \nabla\Phi(y(\mu)) = 0.$$

Differentiating this equation with respect to $\mu$ yields

$$\frac{b}{\mu^2} + \nabla^2\Phi(y(\mu))\,\dot y(\mu) = 0.$$

For given values of $\mu$ and $y(\mu)$ this is a linear equation that can be
solved for $\dot y(\mu)$. (The matrix is the same as the one that is used in the
Newton step for finding $y(\mu)$.) Given this observation we can state a
more efficient predictor corrector method.

Dual predictor corrector method 

Start: Find $y^{(1)}$ and $\mu_1 > 0$ with $y^{(1)} \approx y(\mu_1)$.

For k = 2, 3, ...

■ Choose $\Delta\mu_k \in (0, \mu_{k-1})$ such that $\hat y^{(k)} = y^{(k-1)} - \Delta\mu_k\,\dot y(\mu_{k-1})$
satisfies $C - \mathcal{A}^*(\hat y^{(k)}) \succ 0$.

■ Set $\mu_k = \mu_{k-1} - \Delta\mu_k$ and find $y^{(k)} \approx y(\mu_k)$ by Newton’s method
with line search starting at $\hat y^{(k)}$.

It turns out that $\dot y(\mu_{k-1})$ can be computed fairly accurately even
if only an approximate point $y^{(k-1)} \approx y(\mu_{k-1})$ is known. For details
see e.g. Jarre and Saunders (1993). This predictor corrector method is
“reasonably efficient”, but primal-dual approaches are more efficient in
general.
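
The predictor step can be illustrated on a one-dimensional toy barrier problem. The sketch below assumes the barrier $\Phi(y) = -\log(1 - y^2)$ and $b = 1$ (an instance chosen for illustration only), for which the central path $y(\mu) = -\mu + \sqrt{1 + \mu^2}$ is known in closed form. It solves the linear equation for $\dot y(\mu)$ and compares the extrapolated point with the exact point on the path. All function names are hypothetical.

```python
import math

def central_path(mu):
    """Closed-form minimizer of -y/mu - log(1 - y^2) for the toy problem."""
    return -mu + math.sqrt(1.0 + mu * mu)

def barrier_hessian(y):
    """Second derivative of Phi(y) = -log(1 - y^2)."""
    s = 1.0 - y * y
    return (2.0 + 2.0 * y * y) / (s * s)

def tangent(mu, y, b=1.0):
    """Solve b/mu^2 + Phi''(y) * ydot = 0 for ydot."""
    return -(b / (mu * mu)) / barrier_hessian(y)

def predictor(mu_prev, y_prev, delta_mu):
    """Extrapolate along the tangent from mu_prev to mu_prev - delta_mu."""
    return y_prev - delta_mu * tangent(mu_prev, y_prev)

mu_prev, delta_mu = 0.1, 0.05
y_prev = central_path(mu_prev)
y_hat = predictor(mu_prev, y_prev, delta_mu)
y_new = central_path(mu_prev - delta_mu)

# The predicted point stays strictly feasible and is a much better starting
# point for the corrector (Newton) phase than the previous iterate.
assert abs(y_hat) < 1.0
print("distance of old iterate:", abs(y_prev - y_new))
print("distance of predictor:  ", abs(y_hat - y_new))
```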

We will generalize this method to nonlinear semidefinite programs in
the next section.

3. Nonlinear Semidefinite Programs

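In this section we consider nonlinear semidefinite programs of the form

$$\text{maximize } b^T y \quad \text{s.t. } \mathcal{A}(y) \succeq 0, \quad f_i(y) \le 0 \text{ for } 1 \le i \le m, \qquad (5)$$

where $\mathcal{A}$ is a smooth map from the space of the unknowns $y$ into a space of
symmetric matrices and the $f_i$ are smooth real-valued functions. Note a slight
change of notation: in this chapter $\mathcal{A}$ is a nonlinear operator.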

We define a (possibly nonconvex) barrier function $\tilde\Phi$,

$$\tilde\Phi(y) = -\log\det(\mathcal{A}(y)) - \sum_{i=1}^{m} \log(-f_i(y))$$

and local minimizers

$$y(\mu) = \text{local minimizer of } -\frac{b^T y}{\mu} + \tilde\Phi(y). \qquad (6)$$

In slight abuse of notation we will denote any local minimizer by $y(\mu)$;
this definition therefore does not characterize $y(\mu)$ uniquely.

Replacing $\Phi$ with $\tilde\Phi$, both the barrier method and the predictor
corrector method of Chapter 2.4 can also be applied to solve problem (5).

There are two questions regarding the efficiency of the predictor
corrector method for solving (5). (The barrier method is certainly
impractical!)

■ Does $\bar y = \lim_{k\to\infty} y^{(k)}$ exist, and if so, is $\bar y$ a “good” locally optimal
solution of (5)?

■ How quickly can $y^{(k)}$ be computed?



3.1 Issues of global convergence 

As to the first question, one can show (see e.g. Jarre (2001)) that
any accumulation point $\bar y$ of the sequence $y^{(k)}$ satisfies the Fritz-John
condition (for a definition see e.g. Borgwardt (2001))

$$\exists\, u \ge 0,\ u \ne 0 : \quad -u_0 b + \sum_{i=1}^{m} u_i \nabla f_i(\bar y) + u_{m+1} \nabla\det(\mathcal{A}(\bar y)) = 0.$$

While this condition is reasonable in the absence of a constraint
qualification it is not suitable for semidefinite programs. Indeed,
whenever $\mathcal{A}(\bar y)$ has the eigenvalue zero of multiplicity more than one, then
$\nabla\det(\mathcal{A}(\bar y)) = 0$, so that one can choose $u_{m+1} = 1$ and $u_i = 0$ for all
other $i$.

A more appropriate convergence result therefore is 

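$$\nexists\, \Delta y : \quad b^T \Delta y > 0, \quad \nabla f_i(\bar y)\Delta y < 0 \text{ for all } i \text{ with } f_i(\bar y) = 0, \quad \mathcal{A}(\bar y) + \varepsilon D\mathcal{A}(\bar y)[\Delta y] \succ 0 \text{ for small } \varepsilon > 0.$$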

This result states that there does not exist any direction Ay starting at 
y that is strictly linearized feasible and does not increase the objective 
function. 

Neither of the statements guarantees that $\bar y$ is a local minimizer.
Indeed there are simple degenerate examples for which $\bar y$ is the global
maximizer of (5). As shown in Jongen and Ruiz Jhones (1999), for nonlinear
programs satisfying an LICQ condition and not containing “degenerate”
critical points, the limit point of $y^{(k)}$ is a local minimizer. For such
problems one can still construct examples such that $\bar y$ is a very “poor” local
minimizer. Nevertheless we believe that in many cases $\bar y$ is a minimizer
whose objective value is “close” to the global minimum of (5). This
intuition is motivated by the work of Nesterov (1997). Nesterov
considered the problem of minimizing a quadratic function over the ∞-norm
unit cube. This problem may have very poor local minimizers (whose
objective value is much closer to the global maximum value than it is
to the global minimum). Nesterov shows that any local minimizer over
a $p$-norm cube with a suitable value of $p = O(\log n)$ has much better
global properties in the sense that it is at least as good as the result
guaranteed by the semidefinite relaxation. Intuitively, this result is due
to the fact that the $p$-norm cube “rounds” the vertices and edges of the
∞-norm cube. By this rounding procedure, the poor local minimizers
disappear. In two dimensions the level sets of the logarithmic barrier
function are almost indistinguishable from suitably scaled $p$-norm cubes.
This leads us to believe that at least for quadratic minimization
problems over the ∞-norm unit cube, a suitably implemented barrier method
will also generate “good” local minimizers.

3.2 Efficiency of local minimization

Note that by definition, $y(\mu_k)$ is a local minimizer of (6), and hence,
$\nabla^2\tilde\Phi(y(\mu_k)) \succeq 0$. In all of our test problems the iterates $y^{(k)} \approx y(\mu_k)$
satisfied the stronger condition $\nabla^2\tilde\Phi(y^{(k)}) \succ 0$. If this relation is
satisfied the extrapolation step for computing the predictor point $\hat y^{(k+1)}$ in the
predictor corrector method can be carried out in the same way as in the convex case.

However, the iterates “on the way” from $\hat y^{(k)}$ to $y^{(k)}$ often do not
satisfy $\nabla^2\tilde\Phi \succ 0$. This implies that the concept of self-concordance
that formed the basis of the dual barrier method and of the predictor
corrector method for solving (D) is no longer applicable. While it is not
yet possible to generalize the theory of self-concordance to nonconvex
functions, it seems possible that the known Lipschitz continuity
properties of $\nabla^2\Phi$ carry over in some form to $\nabla^2\tilde\Phi$. The tool that was used for
minimizing the barrier function involving $\Phi$ in Section 2.4 is Newton’s
method. When $\nabla^2\tilde\Phi \not\succeq 0$, Newton’s method with line search for
approximating $y(\mu_k)$ is no longer applicable.

We need to find a suitable generalization of Newton’s method to the
nonconvex case involving $\tilde\Phi$. For this generalization we keep the
following properties in mind: The barrier subproblems that need to be
solved at each step of the barrier method (or of the predictor
corrector method) are systematically ill-conditioned. The condition number
typically is $O(1/\mu)$, and the constant in the order notation is typically
large. In addition, the computation of the Hessian matrices often is very
expensive.

Possible minimization methods for approximating $y(\mu_k)$ include trust
region methods with quasi-Newton updates of an approximate Hessian,
see e.g. Conn et al. (2000), continuation methods, or expensive plane
search strategies as proposed in Jarre (2001).

In numerical examples it turned out that the minimization problems
tend to be quite difficult and none of the minimization methods
converge quickly. In particular, the barrier subproblems appear to be
substantially more difficult to solve than in the convex case. We therefore
address the complexity of smooth nonconvex local minimization. The
next section shows that local minimization is NP-hard in a certain sense.

3.3 Returning to the max cut problem 

We return to the example in Chapter 2.2. As shown in Nesterov
(1998) an inner approximation for the polytope $\mathcal{MC}$ is given by

$$\mathcal{NA} = \left\{ X \in \mathcal{SDP} \;\middle|\; \sin\left[\tfrac{\pi}{2} X\right] \succeq 0 \right\}.$$

Here, the square brackets $\sin[\frac{\pi}{2} X]$ are used to indicate that the sin
function is applied componentwise to each of the matrix entries of $\frac{\pi}{2} X$.
The set $\mathcal{NA}$ is formed from $\mathcal{SDP}$ using the function $c : [-1, 1] \to [-1, 1]$
with $c(t) = \frac{2}{\pi}\arcsin(t)$. This function is a nonlinear “contraction” in the
sense that $|c(t)| \le |t|$.

It is somewhat surprising to find out that $\mathrm{conv}(\mathcal{NA}) = \mathcal{MC}$, i.e.

$$\mathcal{NA} \subset \mathrm{conv}(\mathcal{NA}) = \mathcal{MC} \subset \mathcal{SDP},$$

see Nesterov (1998).
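
The contraction property is easy to check numerically. The sketch below assumes that the contraction is $c(t) = \frac{2}{\pi}\arcsin(t)$, which is what the defining condition $\sin[\frac{\pi}{2} X] \succeq 0$ suggests; this formula is an assumption of the example. It verifies on a grid that $|c(t)| \le |t|$, that $\sin(\frac{\pi}{2} c(t)) = t$, and that the entries ±1 of the matrices $xx^T$ are fixed points of $c$.

```python
import math

def contraction(t):
    """Assumed form c(t) = (2/pi) * arcsin(t) on the interval [-1, 1]."""
    return 2.0 / math.pi * math.asin(t)

grid = [k / 100.0 for k in range(-100, 101)]

# |c(t)| <= |t| on the whole interval (up to rounding).
assert all(abs(contraction(t)) <= abs(t) + 1e-15 for t in grid)

# Applying sin(pi/2 * .) entrywise undoes the contraction.
assert all(abs(math.sin(math.pi / 2.0 * contraction(t)) - t) < 1e-12 for t in grid)

# The entries +1 and -1 of a vertex x x^T are left unchanged.
assert abs(contraction(1.0) - 1.0) < 1e-15
assert abs(contraction(-1.0) + 1.0) < 1e-15
```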

A simple picture can explain the relationship of $\mathcal{MC}$, $\mathcal{SDP}$, and $\mathcal{NA}$.

The set $\mathcal{MC}$ is a polytope whose precise description is not known in
spite of its simple structure. (More precisely, there does not exist any
known polynomial time algorithm which, given a point $X$, either returns
a certificate proving that $X \in \mathcal{MC}$ or returns a separating hyperplane.)

The set $\mathcal{SDP}$ is obtained by “inflating” the set $\mathcal{MC}$ while keeping all
faces of dimension $< n - 2$ fixed. Like a balloon we “pump up” the hull
of $\mathcal{MC}$ while keeping certain low-dimensional boundary manifolds fixed.
(Note that $\mathcal{MC}$ has dimension $n(n-1)/2$.) The set $\mathcal{SDP}$ is convex and is
“efficiently representable”, i.e. there exist efficient numerical algorithms
for minimizing convex functions over $\mathcal{SDP}$.

The set $\mathcal{NA}$ is obtained by shrinking $\mathcal{SDP}$ in a certain nonlinear
fashion. This shrinkage is done in a certain optimal way such that all
boundary manifolds of dimensions 1 and 2 of $\mathcal{MC}$ are contained in $\mathcal{NA}$.
In particular, for $n = 3$ we have $\mathcal{MC} = \mathcal{NA}$, see Hirschfeld and Jarre
(2001).

The set $\mathcal{NA}$ is bounded by two smooth constraints, is star shaped,
contains a ball of radius 1, and is contained in a ball of radius $n$. By our
previous considerations, any locally optimal vertex of

$$\text{minimize } C \bullet X \quad \text{s.t. } X \in \mathcal{NA} \qquad (7)$$

solves the max cut problem (3).

Hence, in spite of the nice properties of $\mathcal{NA}$, it must be very difficult
to find a locally optimal vertex of (7) or to check whether a given vertex
is a local minimum.

Note that (7) is a nonlinear semidefinite program. The difficulty of
the local minimization of (7) is due to the fact that problem (7) suffers
from a systematic violation of any constraint qualification. It contains
many “peaks” similar to the one in

G IR^ I a; > 0, . 

In higher dimensions such peaks become intractable.

3.4 Finding an ε-KKT-point

In a second example, see Hirschfeld and Jarre (2001), the so-called
chained Rosenbrock function $f : \mathbb{R}^n \to \mathbb{R}$,

$$f(x) = (x_1 - 1)^2 + 100 \sum_{i=2}^{n} (x_i - x_{i-1}^2)^2$$

(see also Toint (1978)) has been tested. This function has only one
local minimizer which is also the global minimizer, $x = (1, \dots, 1)^T$.
Applying various trust region methods for minimizing $f$ starting at
$x^{(0)} = (-1, 1, \dots, 1)^T$ results in running times that appear to be
exponential in $n$. (These running times are purely experimental, and due
to time limitations could only be tested for small values of $n$.)
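
For readers who want to repeat such experiments, the following is a minimal plain-Python sketch of the chained Rosenbrock function $f(x) = (x_1 - 1)^2 + 100 \sum_{i=2}^{n} (x_i - x_{i-1}^2)^2$ and its gradient. The function names are hypothetical and no trust region method is included.

```python
def chained_rosenbrock(x):
    """f(x) = (x_1 - 1)^2 + 100 * sum_{i=2}^n (x_i - x_{i-1}^2)^2."""
    value = (x[0] - 1.0) ** 2
    for i in range(1, len(x)):
        value += 100.0 * (x[i] - x[i - 1] ** 2) ** 2
    return value

def chained_rosenbrock_gradient(x):
    """Gradient of the chained Rosenbrock function, assembled term by term."""
    n = len(x)
    grad = [0.0] * n
    grad[0] = 2.0 * (x[0] - 1.0)
    for i in range(1, n):
        r = x[i] - x[i - 1] ** 2              # residual of the i-th coupling term
        grad[i] += 200.0 * r
        grad[i - 1] -= 400.0 * x[i - 1] * r
    return grad

n = 10
x_start = [-1.0] + [1.0] * (n - 1)            # starting point (-1, 1, ..., 1)
x_min = [1.0] * n                             # global minimizer (1, ..., 1)

print(chained_rosenbrock(x_start), chained_rosenbrock(x_min))
print(chained_rosenbrock_gradient(x_start))
```

At the starting point all coupling terms vanish, so only the first component of the gradient is nonzero.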

At first sight this result seems to contradict a statement by Vavasis. 
In the paper Vavasis (1993) the following result is shown. 

Consider the problem

$$\text{minimize } f(x) \quad \text{s.t. } -1 \le x_i \le 1 \text{ for } 1 \le i \le n. \qquad (8)$$

Vavasis assumes that the gradient $\nabla f$ is Lipschitz continuous with
Lipschitz constant $M$ and considers the problem of finding an ε-KKT point
for (8). He presents an algorithm that takes at most O(^) gradient 
evaluations to find an ε-KKT point. This bound is exponential with
respect to the number of digits of the required accuracy, i.e. with respect
to $\log \varepsilon^{-1}$, but linear with respect to $n$.

He also presents a class of functions of two variables for which any 

algorithm has a worst case complexity of at least gradient eval- 

uations to find an e-KKT point. 

The conditions of Vavasis’ paper apply to the Rosenbrock example as
well. All points at which this function is evaluated by the trust region
algorithms lie in the box $-1 \le x_i \le 1$, and moreover, Rosenbrock’s
function possesses moderately bounded norms of $\nabla^2 f$ at these points
implying that $M$ is consistently small. The reason for the observed
exponential growth of the number of iterations lies in the fact that the
norms of the gradients do become small very quickly (as predicted by
Vavasis even for a steepest descent method), but for large $n$, the norm
of $\nabla f$ needs to be extremely small to guarantee that the iterate is close
to a local minimizer. Thus the exponential growth with respect to the
number of variables is due to the fact that the ε-KKT condition is a
poor condition for large $n$. (We don’t know of any better condition
though!) More results on local minimization issues are discussed in the
forthcoming paper Hirschfeld and Jarre (2001).

4. Conclusion

We have highlighted some issues of nonlinear semidefinite program- 
ming related to a dual barrier method. In particular we have raised 
the questions of smoothness, regularity, and computational complexity 
related to semidefinite programs. As preliminary numerical results in 
Jarre (2001) indicate, variants of the predictor corrector method of the 
present paper are reasonably fast for medium size problems (up to 500 
unknowns). The numerical results were also compared with the ones 
in Fukuda and Kojima (2001). In all examples it turned out that the 
method proposed in this paper converged to the global minimizer. This 
gives some further weak evidence that the method is indeed unlikely to 
be “trapped” near poor local minimizers. We also indicated that the 
local convergence of solving the barrier subproblems in the predictor 
corrector method is slow; improvements of this convergence behavior 
are the subject of future research.

Abstract In this paper we motivate and analyze a version of the implicit filtering
algorithm by viewing it as an extension of coordinate search. We then 
show how implicit filtering can be combined with the damped Gauss- 
Newton method to solve noisy nonlinear least squares problems. 

Keywords: noisy optimization, implicit filtering, damped Gauss-Newton iteration, 
nonlinear least squares problems 

1. Introduction 

The purposes of this paper are to show how a version of the implicit 
filtering algorithm [24, 17, 16] can be motivated and analyzed by viewing 
it as an elaboration of coordinate search, and to describe and analyze 
an implicit filtering Gauss-Newton method for nonlinear least squares
problems. 

Our approach to nonlinear least squares problems is based on a finite- 
difference form of the damped Gauss-Newton method [11, 24, 32], but 
differs from that in the MINPACK [30] routine lmdif.f. That code
uses forward difference Jacobians with a user-defined difference incre- 
ment, but that increment is set only once. Implicit filtering uses a cen- 
tral difference not only to compute more accurate Jacobians, but more 
importantly to avoid local minima and to decide when to reduce the 
difference increment. 

Implicit filtering, which we describe in § 2, is a deterministic stencil- 
based sampling method. In general terms, implicit filtering is a finite- 
difference quasi-Newton method in which the size of the difference stencil
decreases as the optimization progresses. In this way one hopes to
“filter” low-amplitude, high-frequency noise in the objective function.

Sampling methods do not use derivatives, but rather sample the ob- 
jective function on a stencil or pattern to determine the progress of the 
iteration and whether or not to change the size, but not the shape, of 
the stencil. Many of these methods, like implicit filtering, the Hooke- 
Jeeves [20] method, and multidirectional search [38, 39], reduce the size 
of the stencil in the course of the optimization. The stencil-size reduction 
policy leads to a convergence theory [24, 5, 39]. 

The best-known sampling method is the Nelder-Mead [31] algorithm. 
This method uses an irregular pattern that changes as the optimization 
progresses, and hence is not stencil-based in the sense of this paper. 
Analytical results for the Nelder-Mead algorithm are limited [24, 5, 26]. 
Theoretical developments are also at a very early stage for more
aggressive sampling methods, like the DIRECT [22] algorithm [14, 15].

Sampling methods, for the most part, need many iterations to obtain 
a high-precision result. Therefore, when gradient information is avail- 
able and the optimization landscape is relatively smooth, conventional 
gradient-based algorithms usually perform far better. Sampling meth- 
ods do well for problems with complex optimization landscapes like the 
ones in Figure 1, where nonsmoothness and nonconvexity can defeat 
most gradient based methods. 




Figure 1. Optimization Landscapes 



We caution the reader that sampling methods are not designed to be 
true global optimization algorithms. Problems with violently oscillatory 
optimization landscapes are candidates for genetic algorithms [19, 35], 
simulated annealing [25, 41], or the DIRECT algorithm [22, 21]. 

The paper is organized as follows. In § 2 we briefly describe the
implicit filtering method and some of the convergence results. We describe
the new algorithm in § 3 and prove a local convergence result. In § 4 we
illustrate the ideas with a parameter identification problem. 

2. Implicit Filtering 

In this section we introduce implicit filtering. We show how the 
method can be viewed as an enhanced form of a simple coordinate search 
method. Convergence analysis for methods of this type is typically done 
in a setting far simpler than one sees in practice. Many results require 
smooth objective functions [28, 26, 12, 39, 8, 9] or objective functions 
that are small perturbations of smooth functions [29, 17, 23, 5, 24, 7, 44]. 
The main results in this paper make the latter assumption. We will also 
assume that the noise decays near an optimal point. Such decay has 
been observed in practice [36, 10, 42, 43, 37, 4] and methods designed 
with this decay in mind can perform well even when the noise does not 
decay to zero as optimality is approached. 

2.1 Coordinate Search 

We begin with a discussion of a coordinate search algorithm, the sim- 
plest of all sampling methods, and consider the unconstrained problem 

$$\min_{x \in \mathbb{R}^N} f(x). \qquad (1)$$

From a current point $x_c$ and stencil radius or scale $h_c$ we sample $f$
at the $2N$ points

$$S(x_c, h_c) = \{x_c \pm h_c e_j\}, \qquad (2)$$

where $e_j$ is the unit vector in the $j$th coordinate direction. Then either
$x_c$ or $h_c$ is changed.



■ If

$$f(x_c) \le \min_{x \in S(x_c, h_c)} f(x) \qquad (3)$$

then we replace $h_c$ by $h_+ = h_c/2$ and set $x_+ = x_c$.

■ Otherwise, we replace $x_c$ by any point $x_+ \in S(x_c, h_c)$ such that

$$f(x_+) = \min_{x \in S(x_c, h_c)} f(x)$$

and let $h_+ = h_c$.

We refer to (3) as stencil failure. If $f$ is Lipschitz continuously
differentiable, then [24, 5] stencil failure implies that

$$\|\nabla f(x_c)\| = O(h_c). \qquad (4)$$

Now, if $f$ has bounded level sets, $h$ will be reduced infinitely many times
because there are only finitely many points on the grid with function
values smaller than $f(x_c)$ [38]. Hence, by (4), the gradient of $f$ will be
driven to zero, giving subsequential convergence to a point that satisfies
the necessary conditions for optimality.
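
The coordinate search iteration can be sketched in a few lines of plain Python. The stopping rule (a smallest scale `h_min` and an evaluation budget `max_evals`) and the function name are assumptions made for this illustration.

```python
def coordinate_search(f, x, h, h_min=1e-6, max_evals=100000):
    """Sample f on the stencil {x +/- h e_j}; halve h on stencil failure."""
    x = list(x)
    fx = f(x)
    evals = 1
    while h > h_min and evals < max_evals:
        best_x, best_f = None, fx
        for j in range(len(x)):
            for sign in (1.0, -1.0):
                trial = list(x)
                trial[j] += sign * h
                f_trial = f(trial)
                evals += 1
                if f_trial < best_f:
                    best_x, best_f = trial, f_trial
        if best_x is None:
            h /= 2.0                  # stencil failure: no stencil point improves on x
        else:
            x, fx = best_x, best_f    # move to the best stencil point, keep the scale
    return x, h

def quadratic(v):
    """Smooth test function with minimizer (1, -0.5)."""
    return (v[0] - 1.0) ** 2 + 2.0 * (v[1] + 0.5) ** 2

x_final, h_final = coordinate_search(quadratic, [0.0, 0.0], 1.0)
print(x_final, h_final)
```

All iterates lie on a grid of width equal to the current scale, which is the property used in the convergence argument above.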

One-sided stencils [24, 36] and more general stencils with < 2N di- 
rections have also been used [1, 2, 27] and have similar theoretical prop- 
erties. Our experience has been that a full centered-difference stencil is 
better in practice. 

Sampling methods do more than solve smooth problems. Consider an
objective which is the sum of a smooth function $f_s$ and a non-smooth
function $\phi$, which we will refer to as the noise:

$$f(x) = f_s(x) + \phi(x). \qquad (5)$$

We assume that $\phi$ is uniformly bounded and small relative to $f_s$, but
make no smoothness or even continuity assumptions beyond that.
High-frequency oscillations in $\phi$ could result in local minima of $f$ which would
trap a conventional gradient-based method far from a minimizer of $f_s$. If
$\phi$ decays sufficiently rapidly near a minimizer of $f$, then the coordinate
search method responds to $f_s$ and, in a sense, “does not see” $\phi$.

To quantify the claim above, we return to the concept of stencil failure.
Define

$$\|\phi\|_{S(x,h)} = \max_{z \in S(x,h)} |\phi(z)|.$$

If (3) holds and $f$ satisfies (5), then [24, 5]

$$\|\nabla f_s(x_c)\| = O\left(h_c + \frac{\|\phi\|_{S(x_c,h_c)}}{h_c}\right). \qquad (6)$$

Now, let $\{x_n\}$ be the sequence of coordinate search iterations and $\{h_n\}$
be the sequence of stencil radii, which we will refer to as scales. If $f$ has
bounded level sets, then the set of possible iterations for a given scale $h$
is finite, as they lie on a grid [39], hence $h_n \to 0$. If, moreover, the noise
decays rapidly enough so that

$$\lim_{n \to \infty} \frac{\|\phi\|_{S(x_n,h_n)}}{h_n} = 0, \qquad (7)$$

then $\nabla f_s(x_n) \to 0$, by (6).

This asymptotic result does not address an important practical issue. 
The number of times that h will be reduced during the optimization 
needs to be specified when the optimization begins or a limit on the
number of calls to $f$ must be imposed. Most implementations of
sampling methods use one or both of these as termination criteria.

In the simple case where $f_s$ is a convex quadratic, for example,
coordinate search “jumps over” oscillations in $\phi$ early in the
iteration, when $h$ is large, and, after finding a neighborhood of the
minimizer, increases the resolution (i.e. decreases the scale) and converges.

2.2 Implicit Filtering 

The version of implicit filtering which we discuss in this paper
accelerates coordinate search with a quasi-Newton method. We use the sample
values to construct a centered difference gradient $\nabla_h f(x_c)$. We then try
to take a quasi-Newton step

$$x_+ = x_c - H_c^{-1} \nabla_h f(x_c) \qquad (8)$$

where $H_c$ is a quasi-Newton model Hessian. We find that the BFGS
[6, 18, 13, 34] update works well for unconstrained problems. We reduce the
scale when either the norm of the difference gradient is sufficiently small
or stencil failure occurs.

We formally describe implicit filtering below as a sequence of calls 
to a finite-difference quasi-Newton algorithm (fdquasi) followed by a 
reduction in the difference increment. The quasi-Newton iteration is 
terminated on entry if stencil failure is detected. The other termination 
criteria of the quasi-Newton iteration reflect the truncation error in the 
difference gradient. The tolerance for the gradient

$$\|\nabla_h f(x)\| \le \tau h \qquad (9)$$

is motivated by the heuristic that the step should be at least of the
same order as the scale, by the implication (6) of stencil failure, and by
the error estimate [24]

$$\|\nabla f_s(x) - \nabla_h f(x)\| = O\left(h^2 + \frac{\|\phi\|_{S(x,h)}}{h}\right). \qquad (10)$$

The performance of implicit filtering can be sensitive to the choice of
the parameter $\tau$ if, as was the case for the earliest implementations of
implicit filtering [36, 17, 10], the test for stencil failure is not incorporated
into the algorithm.

The line search is not guaranteed to succeed because the gradient is
not exact; therefore we allow only a few reductions in the step length
before exiting the quasi-Newton iteration. If the line search fails, then
the sufficient decrease condition

$$f(x_c - \lambda \nabla_h f(x_c)) - f(x_c) < -\alpha \lambda \|\nabla_h f(x_c)\|^2 \qquad (11)$$

has been violated. Here, as is standard [11, 24], $\alpha$ is a small parameter,
typically $10^{-4}$. If both (9) and (11) fail, then one can show in some cases
[17] that the noise is sufficiently larger than the scale to justify
terminating the entire optimization. This leads to the question of selection of the
smallest scale, which is open. In some special cases [17], failure of the
line search can be related to the size of the noise, motivating termination of
the entire optimization because the assumption that $\|\phi\|$ is much smaller
than $h$ is no longer valid.



Algorithm 1 fdquasi(x, f, pmax, τ, h, amax)

p = 1
while p ≤ pmax and $\|\nabla_h f(x)\| > \tau h$ do
    compute $f$ and $\nabla_h f$
    if (3) holds then
        terminate and report stencil failure
    end if
    update the model Hessian $H$ if appropriate; solve $Hd = -\nabla_h f(x)$
    use a backtracking line search, with at most amax backtracks, to
    find a step length $\lambda$
    if amax backtracks have been taken then
        terminate and report line search failure
    end if
    x ← x + λd
    p ← p + 1
end while
if p > pmax report iteration count failure



Implicit filtering is a sequence of calls to fdquasi with the difference 
increments or scales reduced after each return from fdquasi. 



Algorithm 2 imfilter(x, f, pmax, τ, {h_n}, amax)

for k = 0, ... do
    fdquasi(x, f, pmax, τ, h_k, amax)
end for
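
The two algorithms can be sketched in plain Python as follows. This is a simplified illustration, not the implementation analyzed in the references: the model Hessian is fixed to the identity (so the quasi-Newton step reduces to a difference-gradient descent step, with no BFGS update), the parameter values are assumptions, and the noisy test function is made up for the example.

```python
import math

def central_difference_gradient(f, x, h):
    """Return the centered difference gradient and a stencil failure flag."""
    fx = f(x)
    grad, best = [], float("inf")
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = f(xp), f(xm)
        grad.append((fp - fm) / (2.0 * h))
        best = min(best, fp, fm)
    return grad, fx <= best               # flag is True when no stencil point improves

def fdquasi(f, x, h, tau=1.0, pmax=50, amax=5, alpha=1e-4):
    """Inner iteration at scale h with the model Hessian fixed to the identity."""
    for _ in range(pmax):
        grad, stencil_failure = central_difference_gradient(f, x, h)
        norm2 = sum(g * g for g in grad)
        if stencil_failure or math.sqrt(norm2) <= tau * h:
            return x
        fx, lam = f(x), 1.0
        for _ in range(amax + 1):
            trial = [xi - lam * gi for xi, gi in zip(x, grad)]
            if f(trial) - fx < -alpha * lam * norm2:   # sufficient decrease
                break
            lam /= 2.0
        else:
            return x                      # line search failure
        x = trial
    return x                              # iteration count failure

def imfilter(f, x, scales):
    """Call fdquasi once per scale, reusing the last iterate as the next start."""
    for h in scales:
        x = fdquasi(f, x, h)
    return x

X_STAR = (1.0, -0.5)

def noisy_quadratic(x):
    """Smooth quadratic plus oscillatory noise that decays near X_STAR."""
    r2 = sum((xi - si) ** 2 for xi, si in zip(x, X_STAR))
    return 0.5 * r2 + 1e-3 * r2 * math.sin(100.0 * (x[0] + x[1]))

x_final = imfilter(noisy_quadratic, [0.0, 0.0], [2.0 ** (-k) for k in range(21)])
print(x_final)
```

Fixing the model Hessian keeps the sketch short; it is not meant to reproduce the behavior of the quasi-Newton variant.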



Our analysis of coordinate search depended on the fact that

$$\|\nabla f_s(x_n)\| = O\left(h_n + \frac{\|\phi\|_{S(x_n,h_n)}}{h_n}\right) \qquad (12)$$

when stencil failure occurred and that $h$ was reduced when that
happened. Since stencil failure directly implies success, as do (6) and (9)
together, the convergence result for coordinate search will hold for
implicit filtering provided the line search only fails finitely often and the
quasi-Newton iteration terminates because of stencil failure or
satisfaction of (9). We summarize these observations in a theorem from [24].

Theorem 1 Let $f$ satisfy (5) and let $\nabla f_s$ be Lipschitz continuous. Let
$h_n \to 0$ and let $\{x_n\}$ be the implicit filtering sequence. Assume that
either (3) or (11) holds after each call to fdquasi (i.e. there is no line
search failure or iteration count failure) for all but finitely many $k$. Then
if

$$\lim_{n \to \infty} \left(h_n + h_n^{-1} \|\phi\|_{S(x_n,h_n)}\right) = 0 \qquad (13)$$

then any limit point of the sequence $\{x_n\}$ is a critical point of $f_s$.

Theorem 1 does not explain the performance of implicit filtering in 
practice. In fact, other methods, such as coordinate search, Hooke- 
Jeeves, and MDS, also satisfy the conclusion of Theorem 1 if (13) holds, 
[24, 40]. Implicit filtering performs well only if a quasi-Newton model 
Hessian is used. The reasons for the efficacy of the quasi-Newton meth- 
ods are not fully understood. A step toward such an understanding is 
in [7], where a superlinear convergence result is presented. That result 
is somewhat like the one we give in § 3 and we will summarize it here. 

Assumptions on the rate of decrease of $\{h_n\}$ and of the size of $\phi$ must
be made to prove convergence rates. Landscapes like those in Figure 1
motivated the qualitative decay assumption (13). To obtain superlinear
convergence one must ask for much more and demand that $h$ and $\phi$
satisfy

$$\|\nabla_h \phi(x)\| = O(\|x - x^*\|^{1+p}) \tag{14}$$

for some $p > 0$. Here $x^*$ is a local minimizer of $f_s$. Satisfaction of (14)
is possible in practice if both $\phi$ and the scales $h$ decrease near $x^*$. As
an example, suppose that $f_s$ has a local minimizer $x^*$, $\nabla^2 f_s$ is Lipschitz
continuous in a neighborhood of $x^*$, $\nabla^2 f_s(x^*)$ is positive definite, and
for $x$ sufficiently near $x^*$,

$$|\phi(x)| = O(\|x - x^*\|^{2+2p}), \tag{15}$$

for some $p > 0$. In that case, if one sets

$$h_{n+1} = \|\nabla_{h_n} f(x_{n+1})\|^{1+p} \tag{16}$$

and other technical assumptions hold, then one can show that the
implicit filtering iteration, with the BFGS update, is locally superlinearly
convergent to $x^*$.

3. Gauss-Newton Iteration and Implicit Filtering

For the remainder of this paper we focus on nonlinear least squares
objective functions

$$f(x) = \frac{1}{2}\sum_{i=1}^{M} r_i(x)^2 = \frac{1}{2} R(x)^T R(x). \tag{17}$$

We assume that

$$R(x) = R_s(x) + \Phi(x) \tag{18}$$

where $R_s : R^N \to R^M$ is Lipschitz continuously differentiable. Here the
noise $\Phi$ in the residual does not correspond to noise in any data in the
problem, but rather noise in the computation of $R$. As an example, if
one is doing a nonlinear fit to data, $R$ might have the form $R = M(x) - d$,
where $d$ is a vector of data and $x$ are the model parameters. The noise
we have in mind is in the computation of $M$, not in $d$.

The noise $\Phi$ in $R$ can be related to the noise $\phi$ in $f$ by

$$\phi(x) = R_s(x)^T \Phi(x) + \Phi(x)^T \Phi(x)/2. \tag{19}$$
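
The relation between the two noise terms can be checked by expanding the objective. The short derivation below is a sketch that assumes, as the notation suggests but this section does not restate, that $f = f_s + \phi$ with $f_s = R_s^T R_s/2$:

```latex
\begin{aligned}
f(x) &= \tfrac{1}{2}\,(R_s(x) + \Phi(x))^T (R_s(x) + \Phi(x)) \\
     &= \tfrac{1}{2}\,R_s(x)^T R_s(x) + R_s(x)^T \Phi(x) + \tfrac{1}{2}\,\Phi(x)^T \Phi(x) \\
     &= f_s(x) + \phi(x).
\end{aligned}
```

In particular, the leading part of the noise in $f$ is scaled by the smooth residual $R_s$, which is why small residual problems tolerate more noise in $R$.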



3.1 Implicit Filtering Gauss-Newton (IFGN) Algorithm

Our implementation of implicit filtering for nonlinear least squares
differs from the one described in § 2 in two ways:

■ The Jacobian of the residual, not the gradient of the objective
function, is approximated by finite differences.

■ The Gauss-Newton model Hessian is used instead of a quasi-Newton
model Hessian.

We let $\nabla_h R(x)$ be the centered difference gradient of $R$ based on
the stencil $S(x,h)$. Our finite difference Gauss-Newton iteration, Algorithm
fdgauss, must be prepared for stencil failure and failure of the
line search. The sufficient decrease condition is now

$$f(x_c + \lambda d) - f(x_c) < \alpha\lambda\left((\nabla_h R(x_c))^T R(x_c)\right)^T d, \tag{20}$$

where

$$d = -\left((\nabla_h R(x_c))^T \nabla_h R(x_c)\right)^{-1} (\nabla_h R(x_c))^T R(x_c)$$

is the IFGN direction.

Algorithm 3 fdgauss($x, R, pmax, \tau, h, amax$)

    $p = 1$
    while $p \le pmax$ and $\|(\nabla_h R(x))^T R(x)\| > \tau h$ do
        compute $f = R(x)^T R(x)/2$ and $\nabla_h R$
        if (2) holds then
            terminate and report stencil failure
        end if
        set $H = (\nabla_h R(x))^T (\nabla_h R(x))$; solve $Hd = -(\nabla_h R(x))^T R(x)$
        use a backtracking line search, with at most $amax$ backtracks, to
        find a step length $\lambda$
        if $amax$ backtracks have been taken then
            terminate and report line search failure
        end if
        $x \leftarrow x + \lambda d$
        $p \leftarrow p + 1$
    end while
    if $p > pmax$ report iteration count failure



The implicit filtering form of the damped Gauss-Newton method (Algorithm
IFGN) calls fdgauss repeatedly, reducing the scale with each
iteration.

Algorithm 4 IFGN($x, R, pmax, \tau, amax$)

    for $k = 0, \ldots$ do
        fdgauss($x, R, pmax, \tau, h_k, amax$)
    end for
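
The two algorithms above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the author's implementation: the exponential-decay test problem, the sinusoidal noise term, the values `tau=1.0` and `alpha=1e-4`, the helper names, and the form of the stencil failure test (the center is no worse than every stencil point) are all hypothetical choices made here. The scales follow the fixed sequence $2^{-n}$, $n = 4, \ldots, 13$, used later in the numerical example.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def half_norm_sq(r):
    """f = R^T R / 2."""
    return 0.5 * dot(r, r)

def solve(H, b):
    """Solve H d = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(H, b)]
    for k in range(n):
        piv = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[piv] = M[piv], M[k]
        for i in range(k + 1, n):
            m = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= m * M[k][j]
    d = [0.0] * n
    for i in reversed(range(n)):
        d[i] = (M[i][n] - sum(M[i][j] * d[j] for j in range(i + 1, n))) / M[i][i]
    return d

def central_jacobian(R, x, h):
    """Columns of the centered difference Jacobian and residuals on the stencil."""
    cols, stencil = [], []
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        Rp, Rm = R(xp), R(xm)
        stencil.extend([Rp, Rm])
        cols.append([(a - b) / (2.0 * h) for a, b in zip(Rp, Rm)])
    return cols, stencil

def fdgauss(x, R, pmax, tau, h, amax, alpha=1e-4):
    n = len(x)
    for _ in range(pmax):
        Rx = R(x)
        fx = half_norm_sq(Rx)
        J, stencil = central_jacobian(R, x, h)
        g = [dot(col, Rx) for col in J]              # (grad_h R)^T R
        if math.sqrt(dot(g, g)) <= tau * h:
            return x, "converged"
        if fx <= min(half_norm_sq(r) for r in stencil):
            return x, "stencil failure"              # assumed form of the test
        H = [[dot(J[i], J[j]) for j in range(n)] for i in range(n)]
        d = solve(H, [-gi for gi in g])              # Gauss-Newton direction
        slope, lam = dot(g, d), 1.0
        for _backtrack in range(amax + 1):
            trial = [xi + lam * di for xi, di in zip(x, d)]
            if half_norm_sq(R(trial)) - fx < alpha * lam * slope:
                break                                # sufficient decrease holds
            lam *= 0.5
        else:
            return x, "line search failure"
        x = trial
    return x, "iteration count failure"

def ifgn(x, R, pmax, tau, amax, scales):
    for h in scales:                                 # reduce the scale each call
        x, _status = fdgauss(x, R, pmax, tau, h, amax)
    return x

# Hypothetical zero residual fit: model x0*exp(-x1*t) with a small,
# high-frequency perturbation standing in for noise in the computation.
T = [0.1 * i for i in range(1, 11)]
DATA = [2.0 * math.exp(-3.0 * t) for t in T]

def residual(x):
    noise = 1e-7 * math.sin(1e3 * (x[0] + 2.0 * x[1]))
    return [x[0] * math.exp(-x[1] * t) + noise - d for t, d in zip(T, DATA)]

x_fit = ifgn([1.5, 2.0], residual, pmax=100, tau=1.0, amax=10,
             scales=[2.0 ** -n for n in range(4, 14)])
```

The status string returned by `fdgauss` is discarded in `ifgn` here; a fuller implementation would use it, for example to count line search failures.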



3.2 Convergence Analysis 

We will make a distinction between the central difference gradient of
$f = R^T R/2$ and the difference gradient computed via $(\nabla_h R)^T R$, since
the two approximate gradients have different errors, especially in the
small residual case.

For any function $\psi : R^N \to R^L$ (here $L = 1$ or $L = M$), define

$$\|\psi\|_{S(x,h)} = \sup_{z \in S(x,h)} \|\psi(z)\|$$

and

$$E(x, h, \psi) = h^2 + \frac{\|\psi\|_{S(x,h)}}{h}.$$

We can rewrite (10) as

$$\|\nabla f_s(x) - \nabla_h f(x)\| = O(E(x,h,\phi)). \tag{21}$$

Lemma 3.1 gives the analog of (21) for nonlinear least squares problems
in (24) and refines (21) in (23). The error in $(\nabla_h R(x))^T R(x)$ is scaled
by the residual norm, a fact we exploit for zero residual problems in
Lemma 3.3.



Lemma 3.1 Let $R$ be given by (18). Assume that there is $K > 0$ such
that

$$\|\Phi\|_{S(x,h)} \le K\|R_s(x)\|. \tag{22}$$

Then

$$\|\nabla f_s(x) - \nabla_h f(x)\| = O\left(h^2 + \cdots\right), \tag{23}$$



$$\|\nabla f_s(x) - (\nabla_h R(x))^T R(x)\| = O(\|R_s(x)\| E(x,h,\Phi)), \tag{24}$$

and

$$\|R_s'(x)^T R_s'(x) - (\nabla_h R(x))^T \nabla_h R(x)\| = O(E(x,h,\Phi)). \tag{25}$$

The constants in the $O$-terms depend on the norm and the Lipschitz
constant of $R'$.

Proof. The estimate (23) follows from (10) and (19). 

We now prove (24). By definition, 

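$$(\nabla_h R(x))^T R(x) = (\nabla_h(R_s(x) + \Phi(x)))^T (R_s(x) + \Phi(x))$$

$$= (\nabla_h R_s(x))^T R_s(x) + O\left(\cdots \|R(x)\| \cdots\right)$$

$$= \nabla f_s(x) + O\left(\cdots \frac{\|\Phi\|^2_{S(x,h)}}{h} \cdots\right)$$

$$= \nabla f_s(x) + O(\|R_s(x)\| E(x,h,\Phi)),$$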



as asserted. 

The proof of (25) is similar. □ 

Lemma 3.1 leads directly to a simple convergence result, which, for
zero residual problems with only a few stencil failures, requires only that
$E(x_n, h_n, \Phi)$ be bounded, a weaker condition than (7).

Theorem 2 Let $R$ satisfy (18) and assume that $R'$ is Lipschitz continuous.
Let $h_n \to 0$ and let $\{x_n\}$ be the implicit filtering sequence. Assume
that all but finitely many calls to fdgauss return with stencil failure or

$$\|(\nabla_{h_n} R(x_n))^T R(x_n)\| < \tau h_n, \tag{26}$$

that the model Hessians $(\nabla_{h_n} R(x_n))^T \nabla_{h_n} R(x_n)$ are nonsingular, and that the
model Hessians and their inverses are uniformly bounded. Then if

$$\lim_{n\to\infty} E(x_n, h_n, \Phi) = 0 \tag{27}$$

then any limit point of $\{x_n\}$ is a critical point of $f_s$. If, moreover, all but
finitely many calls to fdgauss return with (26), then (27) can be replaced
by

$$\lim_{n\to\infty} \|R_s(x_n)\| E(x_n, h_n, \Phi) = 0. \tag{28}$$

Proof. The convergence assumption (27) requires that

$$\|\Phi\|_{S(x_n,h_n)}/h_n \to 0.$$

In view of (19), this is equivalent to (7) if (22) holds. Hence the first
assertion of the theorem is simply a restatement of Theorem 1.

If the finite-difference Gauss-Newton iteration terminates all but
finitely many times with (26), then

$$\|\nabla f_s(x_n)\| < \tau h_n + O(\|R_s(x_n)\| E(x_n,h_n,\Phi))$$

by (24). This completes the proof. □

3.3 Local Convergence 

To analyze the local convergence behavior of the IFGN iteration, we
must assume that the model Hessians are well conditioned and bounded.
Let $x^*$ be a local minimizer of $f_s(x) = R_s^T(x) R_s(x)/2$ for which the
standard assumptions for convergence of the Gauss-Newton iteration

$$x_+^{GN} = x_c - (R_s'(x_c)^T R_s'(x_c))^{-1} R_s'(x_c)^T R_s(x_c),$$

hold (smoothness, nonsingularity of the model Hessian, sufficiently small
residual).

To quantify this we will assume: 

Assumption 3.1 There is $\rho_0 > 0$ such that

■ $R_s$ is Lipschitz continuously differentiable in the set $V$,

■ the Gauss-Newton model Hessian $R_s'(x)^T R_s'(x)$ and its inverse are
uniformly bounded in $V$, and

■ there are $r_{GN} \in (0, 1)$ and $C_{GN} > 1$ such that for all $x_c \in V$,

$$\|e_+^{GN}\| \le C_{GN}(\|e_c\|^2 + \|R_s(x^*)\|\|e_c\|) \le r_{GN}\|e_c\|. \tag{29}$$

As is standard, we let $e = x - x^*$ for $x \in V$, with the iteration index
for $e$ being inherited from the one for $x$.

Lemma 3.2 Let $R$ be given by (18). Let (22) and Assumption 3.1 hold
and let $x_c \in V$. Then if

$$\sup_{x \in V} E(x, h, \Phi)$$

is sufficiently small, the IFGN model Hessian $(\nabla_h R(x_c))^T \nabla_h R(x_c)$ is
nonsingular. Moreover, if

$$x_+ = x_c - ((\nabla_h R(x_c))^T \nabla_h R(x_c))^{-1} (\nabla_h R(x_c))^T R(x_c)$$

then

$$\|e_+\| = \|e_+^{GN}\| + O(\|R_s(x_c)\| E(x_c,h,\Phi)). \tag{30}$$

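Proof. Let $x_c \in V$. Assumption 3.1 and (25) imply that

$$\|(R_s'(x_c)^T R_s'(x_c))^{-1} - ((\nabla_h R(x_c))^T \nabla_h R(x_c))^{-1}\| = O(E(x_c,h,\Phi)). \tag{31}$$

Now,

$$x_+ = x_+^{GN} + E_H \nabla f_s(x_c) + (R_s'(x_c)^T R_s'(x_c))^{-1} E_g$$

where

$$E_H = (R_s'(x_c)^T R_s'(x_c))^{-1} - ((\nabla_h R(x_c))^T \nabla_h R(x_c))^{-1}$$

and

$$E_g = \nabla f_s(x_c) - (\nabla_h R(x_c))^T R(x_c).$$

Since $\nabla f_s(x_c) = O(\|e_c\|)$, we apply (31) to obtain

$$E_H \nabla f_s(x_c) = O(\|R_s(x_c)\| E(x_c,h,\Phi)).$$

The conclusion now follows from (22) and (24). □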

Theorem 3 Let $R$ be given by (18). Let (22) and Assumption 3.1 hold.
Let $x_0 \in V$. Let $h_n \to 0$. Assume that the implicit filtering sequence
$\{x_n\} \subset V$ and that the line search fails only finitely many times. Then
if (27) holds then $x_n \to x^*$.





3.4 Rates of Convergence 

To obtain rates of convergence we must make stronger assumptions
on $\Phi$, on the scales, and on the convergence rates of the Gauss-Newton
iteration for the smooth problem. We must augment (29) with a lower
bound that states that the Gauss-Newton iteration for $R_s$ converges no
faster than the standard Gauss-Newton convergence rate. This latter
assumption is a nondegeneracy condition on $R_s''$ and is needed for the
superlinear convergence results.

Assumption 3.2 There are $p \in (0, 1]$ and $C_p > 0$ such that

$$\|\Phi(x_c)\| \le C_p\|e_c\|^{2+2p} \tag{32}$$

for all $x_c \in V$. In addition to (29),

$$C_{GN}^{-1}(\|e_c\|^2 + \|R_s(x^*)\|\|e_c\|) \le \|e_+^{GN}\| \tag{33}$$

for all $x_c \in V$.

Lemma 3.3 Let Assumptions 3.1 and 3.2 hold. Then if $x_c$ is sufficiently
near $x^*$ and

$$C_h^{-1}\|e_c\|^{1+p} \le h_c \le C_h\|e_c\|^{(1+p)/2} \tag{34}$$

then there are $0 < r < 1$ and $C > 1$ such that

$$C^{-1}\|e_+^{GN}\| \le \|e_+\| \le C\|e_+^{GN}\| \le r\|e_c\|. \tag{35}$$



Proof. We will show that

$$\|R_s(x_c)\| E(x_c,h_c,\Phi) = o(\|e_+^{GN}\|) \tag{36}$$

for $x_c$ near $x^*$. The result will follow from Lemma 3.2 for $x_c$ sufficiently
near $x^*$.

The bounds (34) on the scale and (32) on the noise imply that

$$E(x_c,h_c,\Phi) = O(\|e_c\|^{1+p}).$$

We consider two cases. If the smooth problem is a zero residual problem
($R_s(x^*) = 0$), then

$$\|R_s(x_c)\| E(x_c,h_c,\Phi) = O(\|e_c\|^{2+p}).$$

In this case, (33) implies (36).

If $R_s(x^*) \ne 0$, then

$$\|R_s(x_c)\| E(x_c,h_c,\Phi) = O(\|e_c\|^{1+p}).$$

However, in that case (33) implies that

$$\|e_+^{GN}\| \ge C_{GN}^{-1}\|R_s(x^*)\|\|e_c\|$$

and (36) holds. This completes the proof. □

In order to apply Lemma 3.3 we need to make sure that (34) holds
throughout the iteration. The most direct way to do this is to update
$h_n$ with an analog of (16)

$$h_{n+1} = \|(\nabla_{h_n} R(x_{n+1}))^T R(x_{n+1})\|^{1+p}. \tag{37}$$



Theorem 4 Let Assumptions 3.1 and 3.2 hold. Then if $x_0$ is sufficiently
near $x^*$,

$$\|\nabla f_s(x_0)\|^{1+p}/2 \le h_0 \le 2\|\nabla f_s(x_0)\|^{(1+p)/2}, \tag{38}$$

and the implicit filtering sequence is defined by Algorithm IFGN and (37),
then $x_n \to x^*$ and

$$C^{-1}\|e_{n+1}^{GN}\| \le \|e_{n+1}\| \le C\|e_{n+1}^{GN}\| \le r\|e_n\|, \tag{39}$$

for all $n \ge 0$.

Proof. Our assumptions imply that (38) is equivalent to (34) with,
for example,

$$C_h = \sup_{x \in V} \|\nabla^2 f_s(x)\|.$$

Hence, proceeding by induction, we need only show that

$$\|\nabla f_s(x_n)\|^{1+p}/2 \le h_n \le 2\|\nabla f_s(x_n)\|^{(1+p)/2} \tag{40}$$

for $n \ge 0$.

By (24), if $h_n$ satisfies (40), then

$$h_{n+1} = \left(\|\nabla f_s(x_{n+1})\| + O(\|R_s(x_{n+1})\| E(x_{n+1},h_n,\Phi))\right)^{1+p}$$

$$= \left(\|\nabla f_s(x_{n+1})\| + o(\|e_{n+1}^{GN}\|)\right)^{1+p}$$

$$= \|\nabla f_s(x_{n+1})\|^{1+p} + o(\|\nabla f_s(x_{n+1})\|^{1+p}).$$

Hence $h_{n+1}$ satisfies (40) for $x_0$ sufficiently near $x^*$. □

Remark: Theorem 4 says that the local convergence of IFGN is
as good asymptotically as Gauss-Newton, if one counts only nonlinear
iterations. For zero residual problems, one need not reduce the scales as
rapidly. If we replace (34) by

(41) 



then (35) becomes

$$\|e_+\| \le C\|e_+^{GN}\| + O(\|R_s(x_c)\|(h_c^2 + \|e_c\|^{1+p})). \tag{42}$$

This will imply superlinear convergence for zero residual problems for
which (22) and (32) hold if $h_n \to 0$. The computations in § 4 illustrate
this.

4. Numerical Example 

We report on the performance of IFGN on a parameter identification
problem taken from [24, 7, 3]. Here $N = 2$ and $M = 100$. The problem
is to identify the stiffness $k$ and damping $c$ in a harmonic oscillator so
that the numerical solution of

$$u'' + cu' + ku = 0; \quad u(0) = u_0, \; u'(0) = 0$$

best fits the data in the least squares sense.

For this example the data are values of the exact solution at $t_i = i/100$
for $1 \le i \le 100$. The numerical solution was computed with the
MATLAB ODE15s integrator [33].

We compare three variations of implicit filtering: IFGN with a fixed
sequence of scales, IFGN with an adaptive sequence that attempts to satisfy
(37), and a version of the implicit filtering/BFGS algorithm from [24, 7]
that has been modified to use adaptive scales. In all three we limit
the optimization to a budget of 100 calls to the function. This does 
not mean that an iteration is terminated before completion, rather we 
monitor the number of function evaluations after a call to the finite 
difference optimizer returns and stop the optimization if the number of 
function evaluations has exceeded the budget after the completion of the 
iteration. 

For all the computations the initial iterate is $(c, k) = (2, 3)$. The
sequence of scales used in the examples is

$$h_n^{(1)} = 2^{-n}, \quad n = 4, \ldots, 13. \tag{43}$$

Following [7], we implement adaptive scales based on a scaled and
safeguarded form of (37),



h 



( 2 ) 

n+1 



= max 



min 



%+D 



||(V»„fl(xo)^iJ(xo)|| j 




(44) 

where $p = 1/2$ and $h_{min}$ is roughly the cube root of machine roundoff,
which is the optimal choice of $h$ for a central difference.

In the examples the line search strategy is to reduce the step by half
if the sufficient decrease condition (either (11) for implicit filtering or
(20) for IFGN) fails. Within both algorithms fdquasi and fdgauss,
$amax = 10$ and $pmax = 100$.



[Plot panels titled tol=1.d-6 and tol=1.d-6]

Figure 2. Parameter Identification Example

In Figure 2 we plot the norm of the difference gradient and the size of
the function for the three variations of implicit filtering and two values, 
10“^ and 10“^, of the tolerance given to ODE15s. One can see that the 
two variations of IFGN did substantially better than an implementation 
of implicit filtering that did not exploit the least squares structure. A 
more subtle difference, explained by the remark at the end of § 3, is 
that while the use of adaptive scales made no visible difference in IFGN’s
ability to reduce the residual (the curves overlap, indicating that the rate
of convergence for both methods is equally fast, i.e. superlinear), it did
make the difference gradient a much better indicator of the progress of
the optimization (the scales that are reduced most rapidly produce more 
accurate gradients). 

We see similar behavior for a small, but non-zero, residual problem. 
In Figure 3 we show the results from the parameter ID problem with
uniformly distributed random numbers in the interval [0, 10“^] added to 
the data. The gradients behave in the same way as in the experiment 
with exact data, while the limiting function values reflect the non-zero 
residual in the high-accuracy simulation. In the low-accuracy simulation, 
the tolerances given to the integrator are smaller than the noise in the 
data, so the figures are almost identical to the one for the noise-free case. 







Figure 3. Parameter Identification Example; Random Noise in Data

Abstract Support vector machines (SVMs) have played a key role in broad classes
of problems arising in various fields. Much more recently, SVMs have 
become the tool of choice for problems arising in data classification and 
mining. This paper emphasizes some recent developments that the au- 
thor and his colleagues have contributed to such as: generalized SVMs (a 
very general mathematical programming framework for SVMs), smooth 
SVMs (a smooth nonlinear equation representation of SVMs solvable 
by a fast Newton method), Lagrangian SVMs (an unconstrained La- 
grangian representation of SVMs leading to an extremely simple itera- 
tive scheme capable of solving classification problems with millions of 
points) and reduced SVMs (a rectangular kernel classifier that utilizes 
as little as 1% of the data). 



1. Introduction 

This paper describes four recent developments, one theoretical, three 
algorithmic, all centered on support vector machines (SVMs). SVMs 
have become the tool of choice for the fundamental classification problem 
of machine learning and data mining. We briefly outline these four
developments now. 

In Section 2 new formulations for SVMs are given as convex math- 
ematical programs which are often quadratic or linear programs. By 
setting apart the two functions of a support vector machine: separa- 
tion of points by a nonlinear surface in the original space of patterns, 
and maximizing the distance between separating planes in a higher
dimensional space, we are able to define indefinite, possibly discontinuous,
kernels, not necessarily inner product ones, that generate highly nonlin- 
ear separating surfaces. Maximizing the distance between the separating 
planes in the higher dimensional space is surrogated by support vector 
suppression, which is achieved by minimizing any desired norm of sup- 
port vector multipliers. The norm may be one induced by the separation 
kernel if it happens to be positive definite, or a Euclidean or a polyhe- 
dral norm. The latter norm leads to a linear program whereas the former 
norms lead to convex quadratic programs, all with an arbitrary separa- 
tion kernel. A standard support vector machine can be recovered by 
using the same kernel for separation and support vector suppression. 

In Section 3 we apply smoothing methods, extensively used for solv- 
ing important mathematical programming problems and applications, to 
generate and solve an unconstrained smooth reformulation of the support 
vector machine for pattern classification using a completely arbitrary 
kernel. We term such reformulation a smooth support vector machine 
(SSVM). A fast Newton-Armijo algorithm for solving the SSVM
converges globally and quadratically. Numerical results and comparisons
demonstrate the effectiveness and speed of the algorithm. For example,
on six publicly available datasets, the tenfold cross validation correctness
of SSVM was the highest among the five methods compared, and SSVM was
also the fastest.

In Section 4 an implicit Lagrangian for the dual of a simple refor- 
mulation of the standard quadratic program of a linear support vector 
machine is proposed. This leads to the minimization of an unconstrained 
differentiable convex function in a space of dimensionality equal to the 
number of classified points. This problem is solvable by an extremely 
simple linearly convergent Lagrangian support vector machine (LSVM) 
algorithm. LSVM requires the inversion at the outset of a single matrix 
of the order of the much smaller dimensionality of the original input 
space plus one. The full algorithm is given in this paper in 11 lines of 
MATLAB code without any special optimization tools such as linear or 
quadratic programming solvers. This LSVM code can be used “as is” to 
solve classification problems with millions of points. 

In Section 5 an algorithm is proposed which generates a nonlinear 
kernel-based separating surface that requires as little as 1% of a large 
dataset for its explicit evaluation. To generate this nonlinear surface, 
the entire dataset is used as a constraint in an optimization problem 
with very few variables corresponding to the 1% of the data kept. The 
remainder of the data can be thrown away after solving the optimization
problem. This is achieved by making use of a rectangular $m \times \bar{m}$
kernel $K(A, \bar{A}')$ that greatly reduces the size of the quadratic program
to be solved and simplifies the characterization of the nonlinear
separating surface. Here, the $m$ rows of $A$ represent the original $m$ data
points while the $\bar{m}$ rows of $\bar{A}$ represent a greatly reduced set of $\bar{m}$ data points.
Computational results indicate that test set correctness for the reduced 
support vector machine (RSVM), with a nonlinear separating surface 
that depends on a small randomly selected portion of the dataset, is 
better than that of a conventional support vector machine (SVM) with 
a nonlinear surface that explicitly depends on the entire dataset, and 
much better than a conventional SVM using a small random sample of 
the data. Computational times, as well as memory usage, are much
smaller for RSVM than for a conventional SVM using the entire
dataset.

A word about our notation. All vectors will be column vectors unless
transposed to a row vector by a prime superscript '. For a vector $x$
in the $n$-dimensional real space $R^n$, the plus function $x_+$ is defined as
$(x_+)_i = \max\{0, x_i\}$, $i = 1, \ldots, n$, while $x_*$ denotes the step function
defined as $(x_*)_i = 1$ if $x_i > 0$ and $(x_*)_i = 0$ if $x_i \le 0$, $i = 1, \ldots, n$. The
scalar (inner) product of two vectors $x$ and $y$ in the $n$-dimensional real
space $R^n$ will be denoted by $x'y$ and the $p$-norm of $x$ will be denoted by
$\|x\|_p$. If $x'y = 0$, we then write $x \perp y$. For a matrix $A \in R^{m \times n}$, $A_i$ is the
$i$th row of $A$ which is a row vector in $R^n$. A column vector of ones of
arbitrary dimension will be denoted by $e$. For $A \in R^{m \times n}$ and $B \in R^{n \times l}$,
the kernel $K(A, B)$ maps $R^{m \times n} \times R^{n \times l}$ into $R^{m \times l}$. In particular, if $x$ and
$y$ are column vectors in $R^n$ then $K(x', y)$ is a real number, $K(x', A')$ is a
row vector in $R^m$ and $K(A, A')$ is an $m \times m$ matrix. If $f$ is a real valued
function defined on the $n$-dimensional real space $R^n$, the gradient of $f$ at
$x$ is denoted by $\nabla f(x)$ which is a row vector in $R^n$ and the $n \times n$ Hessian
matrix of second partial derivatives of $f$ at $x$ is denoted by $\nabla^2 f(x)$. The
base of the natural logarithm will be denoted by $\varepsilon$.
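
The plus and step functions defined in this notation paragraph act componentwise and can be written in a few lines of Python. The function names `plus` and `step` are chosen here for illustration only.

```python
def plus(x):
    """Plus function: replace negative components by zero."""
    return [max(0.0, xi) for xi in x]


def step(x):
    """Step function: 1 where a component is positive, 0 otherwise."""
    return [1.0 if xi > 0 else 0.0 for xi in x]


x = [-1.5, 0.0, 2.0]
print(plus(x))  # [0.0, 0.0, 2.0]
print(step(x))  # [0.0, 0.0, 1.0]
```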

2. The Generalized Support Vector Machine (GSVM) [25]

We consider the problem of classifying $m$ points in the $n$-dimensional
real space $R^n$, represented by the $m \times n$ matrix $A$, according to
membership of each point $A_i$ in the classes +1 or -1 as specified by a given $m \times m$
diagonal matrix $D$ with ones or minus ones along its diagonal. For this
problem the standard support vector machine with a linear kernel $AA'$
[38, 11] is given by the following for some $\nu > 0$:

$$\begin{array}{ll} \min & \nu e'y + \frac{1}{2}w'w \\ \text{s.t.} & D(Aw - e\gamma) + y \ge e \\ & y \ge 0. \end{array} \tag{1}$$



Here $w$ is the normal to the bounding planes:

$$\begin{array}{l} x'w - \gamma = +1 \\ x'w - \gamma = -1, \end{array} \tag{2}$$

and $\gamma$ determines their location relative to the origin. The first plane
above bounds the class +1 points and the second plane bounds the class
-1 points when the two classes are strictly linearly separable, that is
when the slack variable $y = 0$. The linear separating surface is the plane

$$x'w = \gamma, \tag{3}$$



midway between the bounding planes (2). See Figure 1. If the classes
are linearly inseparable then the two planes bound the two classes with
a “soft margin” determined by a nonnegative slack variable $y$, that is:

$$\begin{array}{ll} x'w - \gamma + y_i \ge +1, & \text{for } x' = A_i \text{ and } D_{ii} = +1, \\ x'w - \gamma - y_i \le -1, & \text{for } x' = A_i \text{ and } D_{ii} = -1. \end{array} \tag{4}$$



The 1-norm of the slack variable $y$ is minimized with weight $\nu$ in (1).
The quadratic term in (1), which is twice the reciprocal of the square of
the 2-norm distance $2/\|w\|_2$ between the two bounding planes of (2) in the
$n$-dimensional space of $w \in R^n$ for a fixed $\gamma$, maximizes that distance,
often called the “margin”. Figure 1 depicts the points represented by $A$,
the bounding planes (2) with margin $2/\|w\|_2$, and the separating plane (3)
which separates $A+$, the points represented by rows of $A$ with $D_{ii} = +1$,
from $A-$, the points represented by rows of $A$ with $D_{ii} = -1$.

In the GSVM formulation we attempt to discriminate between the
classes +1 and -1 by a nonlinear separating surface which subsumes the
linear separating surface (3), and is induced by some kernel $K(A, A')$,
as follows:

$$K(x', A')Du = \gamma, \tag{5}$$

where $K(x', A') \in R^m$, e.g. $K(x', A') = x'A'$ for the linear separating
surface (3) and $w = A'Du$. The parameters $u \in R^m$ and $\gamma \in R$ are
determined by solving a mathematical program, typically quadratic or
linear. In special cases, such as the standard SVM (13) below, $u$ can be

Figure 1. The bounding planes (2) with margin $2/\|w\|_2$, and the plane (3)
separating $A+$, the points represented by rows of $A$ with $D_{ii} = +1$, from $A-$, the points
represented by rows of $A$ with $D_{ii} = -1$.

interpreted as a dual variable. A point $x \in R^n$ is classified in class +1
or -1 according to whether the decision function

$$(K(x', A')Du - \gamma)_* \tag{6}$$

yields 1 or 0 respectively. Here $(\cdot)_*$ denotes the step function defined
in the Introduction. The kernel function $K(x', A')$ implicitly defines
a nonlinear map from $x \in R^n$ to some other space $z \in R^k$ where $k$
may be much larger than $n$. In particular if the kernel $K$ is an inner
product kernel under Mercer’s condition [13, pp 138-140], [38, 11, 5] (an
assumption that we will not make in this paper) then for $x$ and $y$ in $R^n$,

$$K(x, y) = h(x)'h(y), \tag{7}$$

and the separating surface (5) becomes:

$$h(x)'h(A')Du = \gamma, \tag{8}$$

where $h$ is a function, not easily computable, from $R^n$ to $R^k$, and
$h(A') \in R^{k \times m}$ results from applying $h$ to the $m$ columns of $A'$. The
difficulty in computing $h$ and the possible high dimensionality of $R^k$
have been important factors in using a kernel $K$ as a generator of an
implicit nonlinear separating surface in the original feature space but
which is linear in the high dimensional space $R^k$. Our separating surface
(5) written in terms of a kernel function retains this advantage and is
linear in its parameters, $u, \gamma$. We now state a mathematical program
that generates such a surface for a general kernel $K$ as follows:

$$\begin{array}{ll} \min & \nu e'y + f(u) \\ \text{s.t.} & D(K(A, A')Du - e\gamma) + y \ge e \\ & y \ge 0. \end{array} \tag{9}$$

Here $f$ is some convex function on $R^m$, typically some norm or
seminorm, and $\nu$ is some positive parameter that weights the separation error
$e'y$ versus suppression of the separating surface parameter $u$.
Suppression of $u$ can be interpreted in one of two ways. We interpret it here as
minimizing the number of support vectors, i.e. constraints of (9) with
positive multipliers. A more conventional interpretation is that of
maximizing some measure of the distance or margin between the bounding
parallel planes in $R^k$, under appropriate assumptions, such as $f$ being
a quadratic function induced by a positive definite kernel $K$ as in (13)
below. As is well known, this leads to improved generalization by
minimizing an upper bound on the VC dimension [38, 35].

We term a solution of the mathematical program (9) and the resulting
decision function (6) a generalized support vector machine, GSVM. In
what follows we derive a number of special cases, including the standard
support vector machine.
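
Once $u$ and $\gamma$ are known, evaluating the decision function (6) only requires one kernel row $K(x', A')$. The sketch below illustrates this with a Gaussian kernel. The kernel choice, the parameter `mu`, the toy data, and the hand-picked values of `u` and `gamma` are assumptions made for illustration; they are not obtained by solving (9).

```python
import math


def gaussian_kernel_row(x, A, mu):
    """K(x', A') for a Gaussian kernel: one entry per row (data point) of A."""
    return [math.exp(-mu * sum((xj - aj) ** 2 for xj, aj in zip(x, row)))
            for row in A]


def decision(x, A, labels, u, gamma, mu=0.5):
    """Step function applied to K(x', A') D u - gamma; returns 1 or 0."""
    k_row = gaussian_kernel_row(x, A, mu)
    value = sum(k * d * ui for k, d, ui in zip(k_row, labels, u)) - gamma
    return 1 if value > 0 else 0


A = [[0.0, 0.0], [1.0, 1.0], [3.0, 3.0], [4.0, 4.0]]
labels = [-1, -1, 1, 1]      # diagonal of D
u = [1.0, 1.0, 1.0, 1.0]     # hand-picked, not the solution of (9)
gamma = 0.0

print(decision([3.5, 3.5], A, labels, u, gamma))  # 1
print(decision([0.5, 0.5], A, labels, u, gamma))  # 0
```

Points for which the corresponding component of $u$ is zero drop out of the sum, which is the sense in which suppressing $u$ reduces the number of support vectors.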

We consider first support vector machines that include the standard
ones [38, 11, 5] and which are obtained by setting $f$ of (9) to be a convex
quadratic function $f(u) = \frac{1}{2}u'Hu$, where $H \in R^{m \times m}$ is some symmetric
positive definite matrix. The mathematical program (9) becomes the
following convex quadratic program:

$$\begin{array}{ll} \min & \nu e'y + \frac{1}{2}u'Hu \\ \text{s.t.} & D(K(A, A')Du - e\gamma) + y \ge e \\ & y \ge 0. \end{array} \tag{10}$$

The Wolfe dual [39, 22] of this convex quadratic program is:

$$\begin{array}{ll} \min\limits_{r \in R^m} & \frac{1}{2}r'DK(A, A')DH^{-1}DK(A, A')'Dr - e'r \\ \text{s.t.} & e'Dr = 0 \\ & 0 \le r \le \nu e. \end{array} \tag{11}$$

Furthermore, the primal variable $u$ is related to the dual variable $r$ by:

$$u = H^{-1}DK(A, A')'Dr. \tag{12}$$

If we assume that the kernel $K(A, A')$ is symmetric positive definite and
let $H = DK(A, A')D$, then our dual problem (11) degenerates to the
dual problem of the standard support vector machine [38, 11, 5] with
$u = r$:

$$\begin{array}{ll} \min\limits_{u \in R^m} & \frac{1}{2}u'DK(A, A')Du - e'u \\ \text{s.t.} & e'Du = 0 \\ & 0 \le u \le \nu e. \end{array} \tag{13}$$

The positive definiteness assumption on $K(A, A')$ in (13) can be relaxed
to positive semidefiniteness while maintaining the convex quadratic
program (10), with $H = DK(A, A')D$, as the direct dual of (13) without
utilizing (11) and (12). The symmetry and positive semidefiniteness of
the kernel $K(A, A')$ for this version of a support vector machine is
consistent with the support vector machine literature. The fact that $r = u$
in the dual formulation (13), shows that the variable u appearing in the 
original formulation (10) is also the dual multiplier vector for the first set 
of constraints of (10). Hence the quadratic term in the objective function 
of (10) can be thought of as suppressing as many multipliers of support 
vectors as possible and thus minimizing the number of such support 
vectors. This is another (nonstandard) interpretation of the standard 
support vector machine that is usually interpreted as maximizing the 
margin or distance between parallel separating planes. 

This leads to the idea of using values for the matrix $H$ other
than $DK(A, A')D$ that will also suppress $u$. One particular choice is
interesting because it puts no restrictions on $K$: no symmetry, no positive
definiteness or semidefiniteness and not even continuity. This is the
choice $H = I$ in (10) which leads to a dual problem (11) with $H = I$
and $u = DK(A, A')'Dr$ as follows:

$$\begin{array}{ll} \min\limits_{r \in R^m} & \frac{1}{2}r'DK(A, A')K(A, A')'Dr - e'r \\ \text{s.t.} & e'Dr = 0 \\ & 0 \le r \le \nu e. \end{array} \tag{14}$$

We note immediately that $K(A, A')K(A, A')'$ is positive semidefinite
with no assumptions on $K(A, A')$, and hence the above problem is an
always solvable convex quadratic program for any kernel $K(A, A')$. In
fact by the Frank-Wolfe existence theorem [15], the quadratic program
(10) is solvable for any symmetric positive definite matrix $H$ because
its objective function is bounded below by zero. Hence by quadratic
programming duality its dual problem (11) is also solvable. Any solution
of (10) can be used to generate a nonlinear decision function (6). Thus we
are free to choose any symmetric positive definite matrix $H$ to generate
a support vector machine. Experimentation will be needed to determine
what are the most appropriate choices for $H$.
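
The positive semidefiniteness used for the choice $H = I$ follows in one line and needs no property of the kernel. Writing $K = K(A, A')$ as shorthand (an abbreviation introduced here), for any $r \in R^m$:

```latex
r' D K K' D r \;=\; (K' D r)'(K' D r) \;=\; \|K' D r\|_2^2 \;\ge\; 0 .
```

The quadratic form in the dual with $H = I$ is therefore convex even when $K$ is nonsymmetric, indefinite, or discontinuous.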

By using the 1-norm instead of the 2-norm a linear programming 
formulation for the GSVM can be obtained. We refer the interested 
reader to [25]. 

We turn our attention now to an efficient method for generating SVMs 
based on smoothing ideas that have already been effectively used to solve 
various mathematical programs [7, 8, 6, 9, 10, 16, 37, 12]. 

3. SSVM: Smooth Support Vector Machines [21] 

In our smooth approach, the square of the 2-norm of the slack variable
$y$ is minimized with weight $\frac{\nu}{2}$ instead of the 1-norm of $y$ as in (1). In
addition the distance between the planes (2) is measured in the
$(n+1)$-dimensional space of $(w, \gamma) \in R^{n+1}$, that is $2/\|(w, \gamma)\|_2$. Measuring the
margin in this $(n+1)$-dimensional space instead of $R^n$ induces strong
convexity and has little or no effect on the problem as was shown in
[26, 27, 21, 20]. Thus using twice the reciprocal squared of this margin
instead, yields our modified SVM problem as follows:

$$\begin{array}{ll} \min & \frac{\nu}{2}y'y + \frac{1}{2}(w'w + \gamma^2) \\ \text{s.t.} & D(Aw - e\gamma) + y \ge e \\ & y \ge 0. \end{array} \tag{15}$$

At the solution of problem (15), $y$ is given by

$$y = (e - D(Aw - e\gamma))_+, \tag{16}$$

where, as defined in the Introduction, $(\cdot)_+$ replaces negative components
of a vector by zeros. Thus, we can replace $y$ in (15) by $(e - D(Aw - e\gamma))_+$
and convert the SVM problem (15) into an equivalent SVM which is an
unconstrained optimization problem as follows:

$$\min_{w, \gamma} \frac{\nu}{2}\|(e - D(Aw - e\gamma))_+\|_2^2 + \frac{1}{2}(w'w + \gamma^2). \tag{17}$$

This problem is a strongly convex minimization problem without any
constraints. It is easy to show that it has a unique solution. However, the
objective function in (17) is not twice differentiable which precludes the
use of a fast Newton method. We thus apply the smoothing techniques
of [7, 8] and replace $x_+$ by a very accurate smooth approximation [21,
Lemma 2.1] that is given by $p(x, \alpha)$, the integral of the sigmoid function
of neural networks [23], that is

$$p(x, \alpha) = x + \frac{1}{\alpha}\log(1 + \varepsilon^{-\alpha x}), \quad \alpha > 0. \tag{18}$$
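
A small numerical sketch shows how closely the smoothing function tracks the plus function as $\alpha$ grows. The function name `smooth_plus` and the sample points are illustrative choices; the branch on the sign of $\alpha x$ is an implementation detail added here to avoid overflow in the exponential and is algebraically equivalent to (18).

```python
import math


def smooth_plus(x, alpha):
    """p(x, alpha) = x + log(1 + exp(-alpha*x))/alpha, evaluated without overflow."""
    t = alpha * x
    if t >= 0:
        return x + math.log1p(math.exp(-t)) / alpha
    # Equivalent form for negative arguments: log(1 + exp(alpha*x))/alpha
    return math.log1p(math.exp(t)) / alpha


for alpha in (1.0, 5.0, 100.0):
    gap = max(smooth_plus(x, alpha) - max(0.0, x)
              for x in (-2.0, -0.1, 0.0, 0.1, 2.0))
    print(alpha, gap)  # the largest gap occurs at x = 0 and equals log(2)/alpha
```

The gap to the plus function is largest at $x = 0$, where it equals $\log 2/\alpha$, so it shrinks in proportion to $1/\alpha$.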
This $p$ function with a smoothing parameter $\alpha$ is used here to replace
the plus function of (17) to obtain a smooth support vector machine
(SSVM):

$$\min_{(w, \gamma) \in R^{n+1}} \Phi_\alpha(w, \gamma) := \frac{\nu}{2}\|p(e - D(Aw - e\gamma), \alpha)\|_2^2 + \frac{1}{2}(w'w + \gamma^2). \tag{19}$$

It can be shown [21, Theorem 2.2] that the solution of problem (15) is
obtained by solving problem (19) with $\alpha$ approaching infinity.
Advantage can be taken of the twice differentiable property of the objective
function of (19) to utilize a quadratically convergent algorithm for
solving the smooth support vector machine (19) as follows.



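Algorithm 3.1 Newton-Armijo Algorithm for SSVM (19)

Start with any $(w^0, \gamma^0) \in R^{n+1}$. Having $(w^i, \gamma^i)$, stop
if the gradient of the objective function of (19) is zero, that is
$\nabla\Phi_\alpha(w^i, \gamma^i) = 0$. Else compute $(w^{i+1}, \gamma^{i+1})$
as follows:

(i) Newton Direction: Determine direction $d^i \in R^{n+1}$ by setting
equal to zero the linearization of $\nabla\Phi_\alpha(w, \gamma)$ around
$(w^i, \gamma^i)$, which gives $n + 1$ linear equations in $n + 1$ variables:

$$\nabla^2\Phi_\alpha(w^i, \gamma^i)\, d^i = -\nabla\Phi_\alpha(w^i, \gamma^i)'.$$   (20)

(ii) Armijo Stepsize [1]: Choose a stepsize $\lambda_i \in R$ such that:

$$(w^{i+1}, \gamma^{i+1}) = (w^i, \gamma^i) + \lambda_i d^i$$   (21)

where $\lambda_i = \max\{1, \frac{1}{2}, \frac{1}{4}, \ldots\}$ such that:

$$\Phi_\alpha(w^i, \gamma^i) - \Phi_\alpha((w^i, \gamma^i) + \lambda_i d^i) \ge -\delta \lambda_i \nabla\Phi_\alpha(w^i, \gamma^i)\, d^i$$   (22)

where $\delta \in (0, \frac{1}{2})$.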

Note that a key difference between our smoothing approach and that 
of the classical SVM [38, 11] is that we are solving here a linear system 
of equations (20) instead of solving a quadratic program as is the case 
with the classical SVM. Furthermore, it can be shown [21, Theorem 3.2] 
that the smoothing algorithm above converges quadratically from any 
starting point. 
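A minimal sketch of the Newton-Armijo iteration for the linear SSVM objective is given below. It is an illustration under stated assumptions, not the implementation of [21]: the six-point dataset, the parameter values (`nu`, `alpha`, `delta`), the stopping tolerance and all function names are hypothetical, and the small Gaussian-elimination routine `solve` stands in for whatever linear solver one would use in practice. The gradient and Hessian are obtained by differentiating the smoothed objective directly, using the fact that the derivative of $p(x, \alpha)$ is the sigmoid $1/(1 + \exp(-\alpha x))$.

```python
import math

def smooth_plus(x, alpha):
    # p(x, alpha), written so that exp() cannot overflow
    return max(x, 0.0) + math.log1p(math.exp(-alpha * abs(x))) / alpha

def sigmoid(t):
    # derivative of p with respect to x, evaluated at t = alpha * x
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def solve(M, b):
    # Gaussian elimination with partial pivoting for a small dense system
    n = len(b)
    aug = [M[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (aug[r][n] - sum(aug[r][k] * x[k] for k in range(r + 1, n))) / aug[r][r]
    return x

def ssvm_newton_armijo(A, d, nu=1.0, alpha=5.0, delta=0.25, tol=1e-8, itmax=50):
    m, n = len(A), len(A[0])
    H = [[d[i] * a for a in A[i]] + [-d[i]] for i in range(m)]  # rows of D[A  -e]
    z = [0.0] * (n + 1)                                          # z = (w, gamma)

    def residuals(z):
        # components of e - D(Aw - e*gamma)
        return [1.0 - sum(h * zj for h, zj in zip(H[i], z)) for i in range(m)]

    def phi(z):
        r = residuals(z)
        return (0.5 * nu * sum(smooth_plus(ri, alpha) ** 2 for ri in r)
                + 0.5 * sum(zj * zj for zj in z))

    for _ in range(itmax):
        r = residuals(z)
        p = [smooth_plus(ri, alpha) for ri in r]
        s = [sigmoid(alpha * ri) for ri in r]
        grad = [z[j] - nu * sum(p[i] * s[i] * H[i][j] for i in range(m))
                for j in range(n + 1)]
        if math.sqrt(sum(g * g for g in grad)) <= tol:
            break
        c = [s[i] ** 2 + p[i] * alpha * s[i] * (1.0 - s[i]) for i in range(m)]
        hess = [[(1.0 if j == k else 0.0)
                 + nu * sum(c[i] * H[i][j] * H[i][k] for i in range(m))
                 for k in range(n + 1)] for j in range(n + 1)]
        step = solve(hess, [-g for g in grad])        # Newton direction
        slope = sum(g * t for g, t in zip(grad, step))
        lam, f0 = 1.0, phi(z)
        # Armijo rule: halve lam until the decrease is sufficient
        while (lam > 1e-12 and
               f0 - phi([zj + lam * t for zj, t in zip(z, step)]) < -delta * lam * slope):
            lam *= 0.5
        z = [zj + lam * t for zj, t in zip(z, step)]
    return z[:n], z[n]

# Hypothetical toy data: two separable clusters in R^2.
A = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [3.0, 3.0], [3.0, 4.0], [4.0, 3.0]]
d = [-1, -1, -1, 1, 1, 1]
w, gamma = ssvm_newton_armijo(A, d)
predicted = [1 if sum(wj * aj for wj, aj in zip(w, a)) - gamma > 0 else -1 for a in A]
print(w, gamma, predicted)
```

Each pass solves one $(n+1) \times (n+1)$ linear system, which is the only nontrivial linear algebra in the method; no quadratic programming solver is involved.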

To obtain a nonlinear SSVM we consider the GSVM formulation (9)
with a 2-norm squared error term on $y$ instead of the 1-norm, and instead
of the convex term $f(u)$ that suppresses $u$ we use a 2-norm squared of
$\begin{bmatrix} u \\ \gamma \end{bmatrix}$ to suppress both $u$ and $\gamma$.
We obtain then:

$$\min_{u,\gamma,y} \ \frac{\nu}{2} y'y + \frac{1}{2}(u'u + \gamma^2) \quad \text{s.t.} \quad D(K(A,A')Du - e\gamma) + y \ge e, \quad y \ge 0.$$   (23)

We repeat the same arguments as above, in going from (15) to (19), to
obtain the SSVM with a nonlinear kernel $K(A, A')$:

$$\min_{u,\gamma} \ \frac{\nu}{2} \|p(e - D(K(A,A')Du - e\gamma), \alpha)\|_2^2 + \frac{1}{2}(u'u + \gamma^2),$$   (24)

where $K(A, A')$ is a kernel map from $R^{m \times n} \times R^{n \times m}$
to $R^{m \times m}$. We note that this problem, which is capable of
generating highly nonlinear separating surfaces, still retains the strong
convexity and differentiability properties for any arbitrary kernel. All of
the convergence results for a linear kernel hold here for a nonlinear
kernel [21].

The effectiveness and speed of the smooth support vector machine
(SSVM) approach can be demonstrated by comparing it numerically
with other methods. In order to evaluate how well each algorithm
generalizes to future data, tenfold cross-validation is performed on each
dataset [36]. To evaluate the efficacy of SSVM, computational times of
SSVM were compared with those of the robust linear program (RLP)
algorithm [2], the feature selection concave minimization (FSV) algorithm,
the support vector machine using the 1-norm approach
(SVM$_{\|\cdot\|_1}$) and the classical support vector machine
(SVM$_{\|\cdot\|_2^2}$) [3, 38, 11]. All tests were run on
six publicly available datasets: the Wisconsin Prognostic Breast Cancer
Database [31] and four datasets from the Irvine Machine Learning
Database Repository [34]. It turned out that the SSVM had both the
highest tenfold testing correctness and the highest computational speed
of these five methods on all datasets tested. Detailed numerical results
are given in [21].

As a test of effectiveness of the SSVM in generating a highly nonlinear
separating surface, we tested it on the 1000-point checkerboard dataset
of [19] depicted in Figure 2. We used the following Gaussian kernel in
the SSVM formulation (24):

Gaussian Kernel: $\exp(-\mu \|A_i - A_j\|_2^2)$, $i, j = 1, 2, 3, \ldots, m$.

The value of the parameter $\mu$ used as well as values of the parameters
$\nu$ and $\alpha$ of the nonlinear SSVM (24) are all given in Figure 3,
which depicts the separation obtained. Note that the boundaries of the
checkerboard are as sharp as those of [26], obtained by a linear
programming solution, and considerably sharper than those of [19],
obtained by a Newton approach applied to a quadratic programming
formulation.

We turn now to an extremely simple iterative algorithm for SVMs 
that requires neither a quadratic program nor a linear program to be 
solved. 



4. LSVM: Lagrangian Support Vector Machines [28] 

We propose here an algorithm based on an implicit Lagrangian of 
the dual of a simple reformulation of the standard quadratic program of 
a linear support vector machine. This leads to the minimization of an 
unconstrained differentiable convex function in a space of dimensionality 
equal to the number of classified points. This problem is solvable by an 
extremely simple linearly convergent Lagrangian support vector machine 
(LSVM) algorithm. LSVM requires the inversion at the outset of a single 
matrix of the order of the much smaller dimensionality of the original 
input space plus one. The full algorithm is given in this paper in 11 lines 
of MATLAB code without any special optimization tools such as linear 
or quadratic programming solvers. This LSVM code can be used “as is” 
to solve classification problems with millions of points. For example, 2 
million points in 10 dimensional input space were classified by a linear 
surface in 6.7 minutes on a 250-MHz UltraSPARC II [28]. 

The starting point for LSVM is the primal quadratic formulation (15)
of the SVM problem. Taking the dual [24] of this problem gives:

$$\min_{0 \le u \in R^m} \ \frac{1}{2} u'\Big(\frac{I}{\nu} + D(AA' + ee')D\Big)u - e'u.$$   (25)

The variables $(w, \gamma)$ of the primal problem which determine the
separating surface $x'w = \gamma$ are recovered directly from the solution
of the dual (25) above by the relations:

$$w = A'Du, \quad y = \frac{u}{\nu}, \quad \gamma = -e'Du.$$   (26)

We immediately note that the matrix appearing in the dual objective
function is positive definite and that there is no equality constraint and
no upper bound on the dual variable $u$. The only constraint present
is a nonnegativity one. These facts lead us to our simple iterative
Lagrangian SVM Algorithm which requires the inversion of a positive
definite $(n + 1) \times (n + 1)$ matrix at the beginning of the algorithm,
followed by a straightforward linearly convergent iterative scheme that
requires no optimization package.

Before stating our algorithm we define two matrices to simplify notation
as follows:

$$H = D[A \ \ {-e}], \quad Q = \frac{I}{\nu} + HH'.$$   (27)

With these definitions the dual problem (25) becomes

$$\min_{0 \le u \in R^m} f(u) := \frac{1}{2} u'Qu - e'u.$$   (28)

It will be understood that within the LSVM Algorithm, the single time
that $Q^{-1}$ is computed at the outset of the algorithm, the SMW identity
[17] will be used:

$$\Big(\frac{I}{\nu} + HH'\Big)^{-1} = \nu\Big(I - H\Big(\frac{I}{\nu} + H'H\Big)^{-1}H'\Big),$$   (29)

where $\nu$ is a positive number and $H$ is an arbitrary $m \times k$ matrix.
Hence only an $(n + 1) \times (n + 1)$ matrix is inverted.
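The SMW identity can be checked directly by multiplying the claimed inverse by $\frac{I}{\nu} + HH'$. The short verification below uses only the symbols already defined, with the abbreviation $M$ introduced here for the small $k \times k$ matrix:

```latex
\begin{align*}
\text{Let } M &:= \tfrac{I}{\nu} + H'H. \text{ Then}\\
\Big(\tfrac{I}{\nu} + HH'\Big)\,\nu\big(I - HM^{-1}H'\big)
  &= (I + \nu HH')\big(I - HM^{-1}H'\big)\\
  &= I + \nu HH' - H\big(I + \nu H'H\big)M^{-1}H'\\
  &= I + \nu HH' - \nu H\Big(\tfrac{I}{\nu} + H'H\Big)M^{-1}H'\\
  &= I + \nu HH' - \nu HH' \;=\; I.
\end{align*}
```

The only inverse that appears is $M^{-1}$, which is $(n+1) \times (n+1)$ when $H = D[A \ \ {-e}]$, regardless of the number of data points $m$.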

The LSVM Algorithm is based directly on the Karush-Kuhn-Tucker
necessary and sufficient optimality conditions [24, KTP 7.2.4, page 94]
for the dual problem (28) which are the following:

$$0 \le u \perp Qu - e \ge 0.$$   (30)

By using the easily established identity between any two real numbers
(or vectors) $a$ and $b$:

$$0 \le a \perp b \ge 0 \iff a = (a - \alpha b)_+, \quad \alpha > 0,$$   (31)

the optimality condition (30) can be written in the following equivalent
form for any positive $\alpha$:

$$Qu - e = ((Qu - e) - \alpha u)_+.$$   (32)
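The identity (31) can be spot-checked numerically for scalars. The sketch below is only an illustration: the sample pairs and the helper names `complementary` and `fixed_point` are hypothetical choices, and a finite check of this kind is of course no substitute for the elementary proof.

```python
def plus(x):
    return max(x, 0.0)

def complementary(a, b, tol=1e-12):
    """True if a >= 0, b >= 0 and a * b = 0 (up to tol)."""
    return a >= -tol and b >= -tol and abs(a * b) <= tol

def fixed_point(a, b, alpha, tol=1e-12):
    """True if a = (a - alpha * b)_+ (up to tol)."""
    return abs(a - plus(a - alpha * b)) <= tol

# Pairs that satisfy the complementarity condition and pairs that violate it.
samples = [(0.0, 0.0), (2.0, 0.0), (0.0, 3.0),
           (1.0, 1.0), (-1.0, 0.0), (0.0, -2.0), (-1.0, 2.0)]
agree = all(complementary(a, b) == fixed_point(a, b, alpha)
            for alpha in (0.1, 1.0, 1.9) for a, b in samples)
print(agree)
```

Condition (32) is this identity applied componentwise with $a = Qu - e$ and $b = u$.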

These optimality conditions lead to the following very simple iterative
scheme which constitutes our LSVM Algorithm:

$$u^{i+1} = Q^{-1}\big(e + ((Qu^i - e) - \alpha u^i)_+\big), \quad i = 0, 1, \ldots,$$   (33)

for which we will establish global linear convergence from any starting
point under the easily satisfiable condition:

$$0 < \alpha < \frac{2}{\nu}.$$   (34)

We implement this condition as $\alpha = 1.9/\nu$ in all our experiments,
where $\nu$ is the parameter of our SVM formulation (25). It turns out, and
this is the way that led us to this iterative scheme, that the optimality
condition (32) is also the necessary and sufficient condition for the
unconstrained minimum of the implicit Lagrangian [30] associated with
the dual problem (28):

$$\min_{u \in R^m} L(u, \alpha) = \min_{u \in R^m} \ \frac{1}{2} u'Qu - e'u + \frac{1}{2\alpha}\big(\|(-\alpha u + Qu - e)_+\|^2 - \|Qu - e\|^2\big).$$   (35)

Setting the gradient with respect to $u$ of this convex and differentiable
Lagrangian to zero gives

$$(Qu - e) + \frac{1}{\alpha}(Q - \alpha I)((Q - \alpha I)u - e)_+ - \frac{1}{\alpha}Q(Qu - e) = 0,$$   (36)

or equivalently:

$$(\alpha I - Q)\big((Qu - e) - ((Q - \alpha I)u - e)_+\big) = 0,$$   (37)

which is equivalent to the optimality condition (32) under the assumption
that $\alpha$ is positive and not an eigenvalue of $Q$.

In [28] global linear convergence of the iteration (33) under condition
(34) is established as follows.

Algorithm 4.1 LSVM Algorithm & Its Global Convergence [28]
Let $Q \in R^{m \times m}$ be the symmetric positive definite matrix defined
by (27) and let (34) hold. Starting with an arbitrary $u^0 \in R^m$, the
iterates $u^i$ of (33) converge to the unique solution $\bar{u}$ of (28) at
the linear rate:

$$\|Qu^{i+1} - Q\bar{u}\| \le \|I - \alpha Q^{-1}\| \cdot \|Qu^i - Q\bar{u}\|.$$   (38)
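The iteration (33) can be tried on a tiny problem with the following sketch, which then checks the optimality conditions (30) at the computed point. This is an illustration under stated assumptions rather than the code of [28]: the six-point dataset, the tolerance, and the function names are hypothetical, and for simplicity the sketch solves a small $m \times m$ system with $Q$ at every step instead of forming $Q^{-1}$ once through the SMW identity.

```python
import math

def solve(M, b):
    # Gaussian elimination with partial pivoting for a small dense system
    n = len(b)
    aug = [M[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (aug[r][n] - sum(aug[r][k] * x[k] for k in range(r + 1, n))) / aug[r][r]
    return x

def lsvm(A, d, nu=1.0, itmax=5000, tol=1e-10):
    m, n = len(A), len(A[0])
    e = [1.0] * m
    H = [[d[i] * a for a in A[i]] + [-d[i]] for i in range(m)]      # H = D[A  -e]
    Q = [[(1.0 / nu if i == j else 0.0) + sum(x * y for x, y in zip(H[i], H[j]))
          for j in range(m)] for i in range(m)]                      # Q = I/nu + HH'
    alpha = 1.9 / nu                                                 # satisfies (34)
    u = solve(Q, e)                                                  # u^0 = Q^{-1} e
    for _ in range(itmax):
        Qu = [sum(Q[i][j] * u[j] for j in range(m)) for i in range(m)]
        rhs = [1.0 + max(Qu[i] - 1.0 - alpha * u[i], 0.0) for i in range(m)]
        new_u = solve(Q, rhs)                                        # iteration (33)
        change = math.sqrt(sum((x - y) ** 2 for x, y in zip(new_u, u)))
        u = new_u
        if change <= tol:
            break
    w = [sum(A[i][j] * d[i] * u[i] for i in range(m)) for j in range(n)]  # w = A'Du
    gamma = -sum(d[i] * u[i] for i in range(m))                           # gamma = -e'Du
    return u, w, gamma, Q

# Hypothetical toy data: two separable clusters in R^2.
A = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [3.0, 3.0], [3.0, 4.0], [4.0, 3.0]]
d = [-1, -1, -1, 1, 1, 1]
u, w, gamma, Q = lsvm(A, d)

# Residual of the optimality conditions (30): u >= 0, Qu - e >= 0, u'(Qu - e) = 0.
r = [sum(Q[i][j] * u[j] for j in range(len(u))) - 1.0 for i in range(len(u))]
print(min(u), min(r), sum(ui * ri for ui, ri in zip(u, r)))
print(w, gamma)
```

For data of realistic size one would follow the paper and apply (29), so that only an $(n+1) \times (n+1)$ matrix is ever inverted.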



A complete MATLAB [32] code of LSVM which is capable of solving
problems with millions of points using only native MATLAB commands
is given below in Code 4.2. The input parameters, besides A, D and $\nu$
of (27), which define the problem, are: itmax, the maximum number of
iterations, and tol, the tolerated nonzero error in $\|u^{i+1} - u^i\|$ at
termination, which can be shown [28] to constitute a bound on the distance
to the unique solution of the problem from the current iterate.

function [it, opt, w, gamma] = svml(A,D,nu,itmax,tol)
% lsvm with SMW for min 1/2*u'*Q*u-e'*u s.t. u=>0,
% Q=I/nu+H*H', H=D[A -e]
% Input: A, D, nu, itmax, tol; Output: it, opt, w, gamma
% [it, opt, w, gamma] = svml(A,D,nu,itmax,tol);
  [m,n]=size(A);alpha=1.9/nu;e=ones(m,1);H=D*[A -e];it=0;
  S=H*inv((speye(n+1)/nu+H'*H));
  u=nu*(1-S*(H'*e));oldu=u+1;
  while it<itmax & norm(oldu-u)>tol
    z=(1+pl(((u/nu+H*(H'*u))-alpha*u)-1));
    oldu=u;
    u=nu*(z-S*(H'*z));
    it=it+1;
  end;
  opt=norm(u-oldu);w=A'*D*u;gamma=-e'*D*u;

function pl = pl(x); pl = (abs(x)+x)/2;

Using this MATLAB code, 2 million random points in 10-dimensional
space were classified in 6.7 minutes in 6 iterations to $10^{-5}$ accuracy
using a 250-MHz UltraSPARC II with 2 gigabytes of memory. In contrast, a
linear programming formulation using CPLEX [14] ran out of memory. Other
favorable numerical comparisons with other methods are contained in
[28].

We turn now to our final topic of extracting very effective classifiers 
from a minimal portion of a large dataset. 

5. RSVM: Reduced Support Vector Machines [20] 

In this section we describe an algorithm that generates a nonlinear
kernel-based separating surface which requires as little as 1% of a large
dataset for its explicit evaluation. To generate this nonlinear surface,
the entire dataset is used as a constraint in an optimization problem
with very few variables corresponding to the 1% of the data kept. The
remainder of the data can be thrown away after solving the optimization
problem. This is achieved by making use of a rectangular $m \times \bar{m}$
kernel $K(A, \bar{A}')$ that greatly reduces the size of the quadratic
program to be solved and simplifies the characterization of the nonlinear
separating surface. Here as before, the $m$ rows of $A$ represent the
original $m$ data points while the $\bar{m}$ rows of $\bar{A}$ represent a
greatly reduced set of $\bar{m}$ data points. Computational results indicate
that test set correctness for the reduced support vector machine (RSVM),
with a nonlinear separating surface that depends on a small randomly
selected portion of the dataset, is better than that of a conventional
support vector machine (SVM) with a nonlinear surface that explicitly
depends on the entire dataset, and much better than that of a conventional
SVM using a small random sample of the data. Computational times, as well
as memory usage, are much smaller for RSVM than for a conventional SVM
using the entire dataset.

The motivation for RSVM comes from the practical objective of
generating a nonlinear separating surface (5) for a large dataset which uses
only a small portion of the dataset for its characterization. The difficulty
in using nonlinear kernels on large datasets is twofold. First, there is
the computational difficulty in solving the potentially huge
unconstrained optimization problem (24) which involves the kernel function
$K(A, A')$ that typically leads to the computer running out of memory
even before beginning the solution process. For example, for the Adult
dataset with 32562 points, which is actually solved with RSVM [20],
this would mean a matrix with over one billion entries for a
conventional SVM. The second difficulty comes from utilizing the formula (5)
for the separating surface on a new unseen point $x$. The formula
dictates that we store and utilize the entire dataset represented by the
$32562 \times 123$ matrix $A$, which may be prohibitively expensive
storage-wise and computing-time-wise. For example, for the Adult dataset
just mentioned, which has an input space of 123 dimensions, this would mean
that the nonlinear surface (5) requires a storage capacity for 4,005,126
numbers. To avoid all these difficulties and based on experience with
chunking methods [4, 29], we hit upon the idea of using a very small
random subset of the dataset given by $\bar{m}$ points of the original $m$
data points with $\bar{m} \ll m$, that we call $\bar{A}$, and use $\bar{A}'$
in place of $A'$ both in the unconstrained optimization problem (24), to cut
problem size and computation time, and for the same purposes in evaluating
the nonlinear surface (5). Note that the matrix $A$ is left intact in
$K(A, \bar{A}')$, whereas $\bar{A}'$ has replaced $A'$. Computational testing
results show a standard deviation of 0.002 or less of test set correctness
over 50 random choices for $\bar{A}$. By contrast, if both $A$ and $A'$ are
replaced by $\bar{A}$ and $\bar{A}'$ respectively, then test set correctness
declines substantially compared to RSVM, while the standard deviation of
test set correctness over 50 cases increases more than tenfold over that of
RSVM.
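The storage figures quoted above for the Adult dataset follow from simple counting, as the sketch below shows. The dataset dimensions are taken from the text; the 1% reduced-set size `m_bar` is an assumption made here for illustration (the text states only that as little as 1% of the data may be kept).

```python
m, n = 32562, 123                    # Adult dataset: points and input dimensions

full_kernel_entries = m * m          # square kernel K(A, A') of a conventional SVM
data_entries = m * n                 # storing A in order to evaluate surface (5)

m_bar = round(0.01 * m)              # assumed reduced-set size: 1% of the data
reduced_kernel_entries = m * m_bar   # rectangular kernel K(A, A_bar')
reduced_data_entries = m_bar * n     # storing A_bar for the reduced surface

print(full_kernel_entries, data_entries)
print(m_bar, reduced_kernel_entries, reduced_data_entries)
```

Under this assumption the rectangular kernel has roughly one hundredth of the entries of the square kernel, and the surface needs only the reduced matrix for its evaluation.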

The justification for our proposed approach is this. We use a small
random sample $\bar{A}$ of our dataset as a representative sample with
respect to the entire dataset $A$ both in solving the optimization problem
(24) and in evaluating the nonlinear separating surface (5). We interpret
this as a possible instance-based learning [33, Chapter 8] where the
small sample $\bar{A}$ is learning from the much larger training set $A$ by
forming the appropriate rectangular kernel relationship $K(A, \bar{A}')$
between the original and reduced sets. This formulation works extremely
well computationally as evidenced by the computational results of [20].

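By using the formulations described in Section 3 for the full dataset
$A \in R^{m \times n}$ with a square kernel $K(A, A') \in R^{m \times m}$, and
modifying these formulations for the reduced dataset
$\bar{A} \in R^{\bar{m} \times n}$ with corresponding diagonal matrix
$\bar{D}$ and rectangular kernel $K(A, \bar{A}') \in R^{m \times \bar{m}}$, we
obtain our RSVM Algorithm below. This algorithm solves, by smoothing, the
RSVM quadratic program obtained from (23) by replacing $A'$ with $\bar{A}'$
as follows:

$$\min_{(\bar{u},\gamma,y) \in R^{\bar{m}+1+m}} \ \frac{\nu}{2} y'y + \frac{1}{2}(\bar{u}'\bar{u} + \gamma^2) \quad \text{s.t.} \quad D(K(A,\bar{A}')\bar{D}\bar{u} - e\gamma) + y \ge e, \quad y \ge 0.$$   (39)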



Algorithm 5.1 RSVM Algorithm

(i) Choose a random subset matrix $\bar{A} \in R^{\bar{m} \times n}$ of the
original data matrix $A \in R^{m \times n}$. Typically $\bar{m}$ is 1% to 10%
of $m$.

(ii) Solve the following modified version of the SSVM (24) where $A'$
only is replaced by $\bar{A}'$ with corresponding $\bar{D} \subset D$:

$$\min_{(\bar{u},\gamma) \in R^{\bar{m}+1}} \ \frac{\nu}{2} \|p(e - D(K(A,\bar{A}')\bar{D}\bar{u} - e\gamma), \alpha)\|_2^2 + \frac{1}{2}(\bar{u}'\bar{u} + \gamma^2),$$   (40)

which is equivalent to solving (23) with $A'$ only replaced by $\bar{A}'$.

(iii) The separating surface is given by (5) with $A'$ replaced by
$\bar{A}'$ as follows:

$$K(x', \bar{A}')\bar{D}\bar{u} = \gamma,$$   (41)

where $(\bar{u}, \gamma) \in R^{\bar{m}+1}$ is the unique solution of (40),
and $x \in R^n$ is a free input space variable of a new point.

(iv) A new input point $x \in R^n$ is classified into class +1 or -1
depending on whether the step function:

$$(K(x', \bar{A}')\bar{D}\bar{u} - \gamma)_*,$$   (42)

is +1 or zero, respectively.

As stated earlier, this algorithm is quite insensitive as to which
submatrix $\bar{A}$ is chosen for (40)-(41), as far as tenfold
cross-validation correctness is concerned. In fact, another choice for
$\bar{A}$ is to choose it randomly but only keep rows that are more than a
certain minimal distance apart. This leads to a slight improvement in
testing correctness but increases computational time somewhat. Replacing
both $A$ and $A'$ in a conventional SVM by a randomly chosen reduced matrix
$\bar{A}$ and its transpose $\bar{A}'$ gives poor testing set results that
vary significantly with the choice of $\bar{A}$. This fact can be
demonstrated graphically as follows.

The checkerboard dataset [18, 19], already used earlier, consists of
1000 black and white points in $R^2$ taken from sixteen black
and white squares of a checkerboard. This dataset is chosen in order
to depict graphically the effectiveness of RSVM using a random 5% of
the given 1000-point training dataset compared to the very poor
performance of a conventional SVM on the same 5% randomly chosen subset.
Figure 4 shows the poor pattern approximating a checkerboard obtained
by a conventional SVM using a Gaussian kernel, that is, solving (23) with
both $A$ and $A'$ replaced by the randomly chosen $\bar{A}$ and its
transpose $\bar{A}'$ respectively. Test set correctness of this conventional
SVM using the reduced $\bar{A}$ and $\bar{A}'$ averaged, over 15 cases,
43.60% for the 50-point dataset, on a test set of 39601 points. In contrast,
using our RSVM Algorithm 5.1 on the same randomly chosen submatrices
$\bar{A}'$ yields the much more accurate representations of the checkerboard
depicted in Figure 5, with corresponding average test set correctness of
96.70% on the same test set.
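Steps (iii) and (iv) of the RSVM Algorithm can be sketched as follows. The sketch assumes a Gaussian kernel of the form $\exp(-\mu \|A_i - \bar{A}_j\|_2^2)$; the toy data, the reduced set, and in particular the values of `u_bar` and `gamma` are hypothetical placeholders chosen by hand, not the solution of (40). The point is only to show that the kernel is rectangular and that a new point is classified using the reduced set alone.

```python
import math

def gaussian_kernel(A, B, mu):
    """Entry (i, j) is exp(-mu * ||A_i - B_j||^2) for rows A_i of A and B_j of B."""
    return [[math.exp(-mu * sum((a - b) ** 2 for a, b in zip(ai, bj))) for bj in B]
            for ai in A]

def rsvm_classify(x, A_bar, d_bar, u_bar, gamma, mu):
    """Return +1 or -1 according to the sign of K(x', A_bar') D_bar u_bar - gamma."""
    k = gaussian_kernel([x], A_bar, mu)[0]
    value = sum(kj * dj * uj for kj, dj, uj in zip(k, d_bar, u_bar)) - gamma
    return 1 if value > 0 else -1

# Hypothetical toy data: m = 5 points in R^2, reduced set of m_bar = 2 rows.
A = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [3.0, 3.0], [3.0, 4.0]]
A_bar = [A[0], A[3]]
d_bar = [-1, 1]            # class labels of the two retained rows
u_bar = [1.0, 1.0]         # placeholder values, NOT obtained by solving (40)
gamma = 0.0                # placeholder value
mu = 0.5

K = gaussian_kernel(A, A_bar, mu)        # rectangular m x m_bar kernel
print(len(K), len(K[0]))
print(rsvm_classify([0.2, 0.1], A_bar, d_bar, u_bar, gamma, mu),
      rsvm_classify([3.0, 3.5], A_bar, d_bar, u_bar, gamma, mu))
```

Once $(\bar{u}, \gamma)$ is available, `rsvm_classify` touches only the rows of the reduced matrix, which is why the remaining data can be discarded after training.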

6. Conclusion and Extensions 

We have described the important role of support vector machines in 
solving the key problem of classification that arises in data mining and 
machine learning. In particular we have described a general framework 
for support vector machines and given three highly effective algorithms 
for generating linear and nonlinear classifiers. In all our results mathe- 
matical programming plays key theoretical and algorithmic roles. Some 
extensions of these ideas include multicategory classification,
classification based on criteria other than belonging to a halfspace, incremental
classification of massive streaming datasets, concurrent feature and data
selection for optimal classification, classification based on minimal data
subsets and multiple instance classification.

Figure 4. SVM: Checkerboard resulting from a randomly selected 50 points, out
of a 1000-point dataset, and used in a conventional Gaussian kernel SVM (23). The 
resulting nonlinear surface, separating white and black areas, generated using the 50 
random points only, depends explicitly on those points only. Correctness on a 39601- 
point test set averaged 43.60% on 15 randomly chosen 50-point sets, with a standard 
deviation of 0.0895 and best correctness of 61.03% depicted above. 




Figure 5. RSVM: Checkerboard resulting from randomly selected 50 points and 
used in a reduced Gaussian kernel SVM (39). The resulting nonlinear surface, sepa- 
rating white and black areas, generated using the entire 1000-point dataset, depends 
explicitly on the 50 points only. The remaining 950 points can be thrown away once 
the separating surface has been generated. Correctness on a 39601-point test set 
averaged 96.7% on 15 randomly chosen 50-point sets, with a standard deviation of 
0.0082 and best correctness of 98.04% depicted above.

Abstract We present mathematical models for a power market on a linearized
DC network with affine demand. The models represent the conjecture 
that each power generating company may hold regarding how rival firms 
will change their outputs if prices change. The classic Cournot model is 
a special case of this conjecture. The models differ in how arbitrage is 
handled, and their formulations give rise to nonlinear mixed complemen- 
tarity problems. In the Stackelberg version, the generators anticipate 
how arbitrage would affect prices at different locations, and therefore 
treat the arbitrage amounts as decision variables in their profit
maximization problems. In the other version, arbitrage is exogenous to the
firms. We show that solutions to the latter model are also solutions to
the Stackelberg model. We also demonstrate existence and uniqueness 
properties for the exogenous arbitrage model. 



1. Introduction 

In restructured power markets, electric power generators have been 
privatized or freed of regulatory constraints on prices. The intent of 
restructuring is to provide incentives for innovation and more efficient 
production and consumption of electricity [5]. However, because of mar- 
ket failures, these benefits may not be fully realized. A market failure 
that has been of particular concern to regulators and the public is market 
power [12]. Market power is defined as the ability of a market partic- 
ipant to unilaterally alter prices in its own favor, and to sustain those 
price changes. Transmission capacity limits that restrict power imports 
and exports are an important source of market power for generating 
companies, as they allow firms within an isolated region to raise prices 
above competitive levels [2]. 

The potential for market power to be exercised within a given power 
system can be studied through laboratory experiments, empirical anal- 
ysis, and modeling. There are many models of strategic interaction in 
transmission constrained systems (for reviews, see [7] or [8]). Models can 
be used to unveil unanticipated ways in which market power might be 
exercised on networks, to identify locations where prices can be manip- 
ulated, to assess the effects of adding transmission capacity upon prices, 
and to examine the competitive effects of company mergers or divest- 
ments. The most common oligopolistic modeling frameworks employed 
in power market analyses are based on the ideas of Cournot games and 
Supply Function Equilibria (SFE), defined below. 

The purpose of this paper is to analyze the existence and uniqueness 
properties of solutions of a new model of oligopolistic power genera- 
tors. The model represents the power network using a linearized “DC” 
load flow model [13], and includes a flexible representation of interac- 
tions of competing generating firms. We term this representation the 
“conjectured supply functions” (CSFs) approach. A CSF is a function 
representing the beliefs of a firm concerning how total supply from rival 
firms will react to price. Two versions of a linear CSF have been pro- 
posed: one in which the slope of conjectured supply response is constant 
and the intercept is to be solved for, and another in which the intercept 
is given but the slope is to be determined. The former CSF yields a
linear mixed complementarity problem (MCP) for the market equilibrium,
while the latter gives a nonlinear MCP.

The CSF model can be viewed as a generalization of the Cournot
models of [7] and [14] in that the amount by which rival firms are
anticipated to adjust their supply in response to a price change is not
restricted to zero
(the Cournot assumption). Instead, each generating company is allowed 
to conjecture that rival firms will react to price increases or decreases. 
By making different assumptions about the assumed supply response, 
different degrees of competitive intensity can be modeled, ranging from 
pure competition (infinitely large positive response by rivals to price 
increases) to oligopolistic Cournot competition (no response). Positive 
sloped CSFs represent a competitive intensity between the Cournot and 
pure competition extremes. A detailed justification of the CSF approach 
to modeling competition on transmission networks is given in [4], along 
with an application to the United Kingdom power system. 

It should be noted that the CSF modeling approach is distinct from 
the widely used supply function equilibrium (SFE) approach to market 
modeling [1, 9]. The SFE is a Nash game in bid functions, in which 
suppliers provide a function to a central auctioneer that relates their 
willingness to supply to the price. The SFE approach also yields prices 
intermediate between the pure competition and Cournot extremes, but is 
plagued by computational challenges along with problems of nonunique- 
ness and, in some cases, nonexistence of solutions [2]. The fundamental 
difference between the SFE and CSF approaches is that the anticipated 
supply response of competitors is endogenous in SFE models and is 
consistent with the competitor’s actual bid function, while in the CSF 
approach, the conjectured supply response of competing firms is instead 
based on an assumed parameter (slope or intercept). It is this difference 
that allows CSF models to be formulated as mixed complementarity problems
that are relatively easy to solve and yield solutions whose existence and 
uniqueness properties can be demonstrated. 

Questions concerning the existence and uniqueness of equilibrium so- 
lutions to market models are important for two reasons. First, public 
policy is in part based on policy analyses using market models; if unique 
solutions cannot be assured, then the question arises as to whether the 
conclusions of an analysis depend on which of several possible solutions 
is selected. Second, if a solution exists and is unique, then computational 
procedures do not need to check for multiple solutions, and are therefore 
simpler. This paper focuses on the existence and uniqueness properties 
of the solution of the nonlinear MCP (fixed intercept model), as those 
properties for the linear MCP (fixed slope) are readily established using 
the results of [11]. 

The paper begins by defining notation and the profit maximizing 
problems that are common to all the models presented in this paper
(Section 2). Those common problems include the profit maximization
problems for the independent system operator (ISO) who allocates scarce 
transmission capacity, and the arbitrager who eliminates any noncost- 
based price differences among nodes in the network. Consumers are 
represented by downward sloping demand curves. The various models 
introduced in the paper differ in terms of their representation of the 
profit maximization problem for the oligopolistic power producer. The 
first model, Model I, is introduced in Section 3. There, the power pro- 
ducer makes production and sales decisions recognizing that demand 
responds to price, that rival producers will react to price changes (ac- 
cording to the assumed CSF), and that noncost-based price differences 
will be arbitraged away. Inclusion of arbitrage means that the arbi- 
trager’s equilibrium conditions are introduced as constraints in the pro- 
ducer’s constraint set. After introducing the producer profit maximiza- 
tion problem, we obtain the nonlinear MCP that represents the market 
equilibrium. Section 4 presents Model II which differs from Model I 
in that the arbitrager’s equilibrium conditions are kept outside of the 
producer’s problem, resulting in a model which can be analyzed more 
fully than Model I. In Section 5, relevant theory of monotone linear 
complementarity problems is introduced which will be the basis for the 
demonstrations of the model properties. This theory is used in Section 6 
to establish the existence of solutions to Models I and II, the conditions 
under which solutions to Model II exist and certain of the variables 
(prices, total generation, sales, and profits) are unique. 

2. The ISO and Arbitrage Models 

In this and the next two sections, we present the mixed NCP formula- 
tions of the market equilibrium with conjectured supply functions. The 
resulting models become the respective linear complementarity models 
considered in [11] when the intercepts tend to minus infinity. In what 
follows, we present the NCP models, establish the existence of solutions 
and analyze their properties. 

2.1 Notation 

Before presenting the mathematical formulations for the models, we 
summarize the notation. 

Parameters

$\mathcal{N}$ : set of nodes, excluding the hub
$\mathcal{A}$ : set of transmission elements in the full network
$\mathcal{F}$ : set of firms
$\alpha_i$ : fixed intercept of supply function at node $i$
$c_{fi}$ : cost per unit generation at node $i$ by firm $f$
$P_i^0$ : price intercept of supply function at node $i$
$Q_i^0$ : quantity intercept of supply function at node $i$
$T_k^+$ : capacity on transmission element $k$
$T_k^-$ : capacity in the reverse direction of transmission element $k$
$CAP_{fi}$ : production capacity at node $i$ for firm $f$
$PDF_{ik}$ : power distribution factor for node $i$ on element $k$, describing
the megawatt (MW) increase in flow resulting from 1 MW of
power injection at $i$ and 1 MW of withdrawal at a hub node.



Variables

$s_{fi}$ : amount of sales at node $i$ by firm $f$
$g_{fi}$ : generation at node $i$ by firm $f$
$p_{fi}$ : price at node $i$ anticipated by firm $f$
$y_i$ : amount of transmission service from hub H to node $i$
$w_i$ : transmission price from hub H to node $i$
$p_f$ : price at the hub node, anticipated by firm $f$
$a_{fi}$ : amount that arbitragers sell at node $i$, anticipated by firm $f$
$\lambda_k^{\pm}$ : dual variables of transmission capacity constraints in
ISO's problem
$\gamma_{fi}$ : dual variable of production capacity constraint in
firm $f$'s problem
$\rho_f$ : dual variable of balance equation between supply and generation
in firm $f$'s problem
$\pi_i$ : market price at node $i$

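Vectors & Matrices

$\mathbf{1}$ : vector of ones of appropriate size
$I$ : identity matrix of appropriate order
$E$ : square matrix of ones of appropriate order
$\Pi$ : $|\mathcal{N}| \times |\mathcal{A}|$ matrix of $PDF_{ik}$, $i \in \mathcal{N}$ and $k \in \mathcal{A}$
$s$ : $(|\mathcal{N}| \times |\mathcal{F}|)$-vector of $s_{fi}$, $i \in \mathcal{N}$ and $f \in \mathcal{F}$
$g$ : $(|\mathcal{N}| \times |\mathcal{F}|)$-vector of $g_{fi}$, $i \in \mathcal{N}$ and $f \in \mathcal{F}$
$\pi$ : $|\mathcal{N}|$-vector of equilibrium prices $\pi_i$, $i \in \mathcal{N}$
$\gamma$ : $(|\mathcal{N}| \times |\mathcal{F}|)$-vector of $\gamma_{fi}$, $i \in \mathcal{N}$ and $f \in \mathcal{F}$
$\lambda^{\pm}$ : $|\mathcal{A}|$-vectors of $\lambda_k^{\pm}$, $k \in \mathcal{A}$
$c$ : $(|\mathcal{N}| \times |\mathcal{F}|)$-vector of $c_{fi}$, $i \in \mathcal{N}$ and $f \in \mathcal{F}$
$CAP$ : $(|\mathcal{N}| \times |\mathcal{F}|)$-vector of $CAP_{fi}$, $i \in \mathcal{N}$ and $f \in \mathcal{F}$.

The components of the vectors $s$, $g$, $c$, and $CAP$ are grouped by firms;
that is

$$s = (s_1, \ldots, s_{|\mathcal{F}|})^T,$$

where each $s_f$ is the $|\mathcal{N}|$-vector with components $s_{fi}$,
$i \in \mathcal{N}$. The other three vectors $g$, $c$, and $CAP$ are similarly
arranged. Except for the supply intercepts and some power distribution
factors, all parameters of the models are positive.

2.2 The ISO’s problem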

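The ISO’s problem is the following linear program (LP). Given the
transmission prices $w_i$, $i \in \mathcal{N}$, compute $y_i$,
$i \in \mathcal{N}$ in order to

$$\begin{array}{lll}
\text{maximize} & \sum_{i \in \mathcal{N}} w_i y_i & \\
\text{subject to} & \sum_{i \in \mathcal{N}} y_i = 0, & (\eta) \\
 & \sum_{i \in \mathcal{N}} PDF_{ik}\, y_i \le T_k^+, \quad \forall k \in \mathcal{A}, & (\lambda_k^+) \\
 & \sum_{i \in \mathcal{N}} PDF_{ik}\, y_i \ge -T_k^-, \quad \forall k \in \mathcal{A}, & (\lambda_k^-)
\end{array}$$   (1)

where we write the dual variables in parentheses next to the
corresponding constraints. Note that the variables $y_i$ are unrestricted in
sign. A positive (negative) $y_i$ means that there is a net flow into (out
of) node $i$. It is trivial to note that $y = 0$ is always a feasible
solution to the above LP, because the $T_k^{\pm}$ are positive scalars. The
optimality conditions of the LP can be written as a mixed LCP in the
variables $y_i$ for $i \in \mathcal{N}$, $\lambda_k^{\pm}$ for
$k \in \mathcal{A}$ and $\eta$, parameterized by the transmission fees $w_i$,
$i \in \mathcal{N}$:

$$\begin{array}{l}
0 \le \lambda_k^- \perp T_k^- + \sum_{i \in \mathcal{N}} PDF_{ik}\, y_i \ge 0, \quad k \in \mathcal{A}, \\
0 \le \lambda_k^+ \perp T_k^+ - \sum_{i \in \mathcal{N}} PDF_{ik}\, y_i \ge 0, \quad k \in \mathcal{A}, \\
0 = \sum_{i \in \mathcal{N}} y_i, \\
0 = w_i + \sum_{k \in \mathcal{A}} PDF_{ik}\,(\lambda_k^- - \lambda_k^+) - \eta, \quad i \in \mathcal{N}.
\end{array}$$   (2)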



2.3 The arbitrager’s problem 

The arbitrager maximizes its profit by buying and selling power in the market, given the prices at the nodes in the network. With $a_i$ denoting the arbitrage amount sold at node $i$, the arbitrager's profit maximization problem is very simple: for fixed prices $p_i$ and costs $w_i$, compute $a_i$, $i \in \mathcal{N}$, in order to

$$
\begin{array}{lll}
\text{maximize} & \displaystyle\sum_{i \in \mathcal{N}} ( p_i - w_i )\, a_i & \\[2mm]
\text{subject to} & \displaystyle\sum_{i \in \mathcal{N}} a_i = 0, & (p_H)
\end{array}
\qquad (3)
$$

The transmission fee at node $i$ is included in the objective function because the arbitrager must also pay this cost. The arbitrage amounts are measured as the net sales at a node; thus the sum of all the arbitrage amounts must equal zero. Note that $a_i$ is unrestricted in sign. A positive $a_i$ represents the amount sold by the arbitrager at node $i$; in this case, the arbitrager is receiving $p_i$ for each unit sold but is paying $w_i$ for the transmission. If $a_i$ is negative, then $|a_i|$ is the quantity that the arbitrager bought from node $i$; in this case, the arbitrager is paying $p_i$ per unit and paying $-w_i$ per unit to ship out of $i$.

The problem (3) is trivially solvable. In particular, this problem is equivalent to the two equations:

$$
\begin{array}{l}
p_i - w_i - p_H = 0, \quad \forall i \in \mathcal{N}, \\[2mm]
\displaystyle\sum_{i \in \mathcal{N}} a_i = 0.
\end{array}
\qquad (4)
$$

In turn, the first equation implies

$$ p_i - p_j = w_i - w_j, \quad \forall i, j \in \mathcal{N}, $$

which says that the difference in prices at two distinct nodes is exactly the difference between the transmission fees at those two nodes.
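
As a small numerical illustration of the no-arbitrage relation, the following sketch builds nodal prices from a hub price and a set of fees and checks that price differences equal fee differences. The node names, the fee values, and the hub price are hypothetical and chosen only for illustration.

```python
def nodal_prices(hub_price, fees):
    """Return p_i = p_H + w_i for every node (first equation of the no-arbitrage system)."""
    return {node: hub_price + fee for node, fee in fees.items()}


# Hypothetical data: three nodes with transmission fees w_i and a hub price p_H.
fees = {"A": 0.0, "B": 3.5, "C": -1.25}
prices = nodal_prices(40.0, fees)

# Largest violation of p_i - p_j = w_i - w_j over all ordered pairs of nodes.
max_gap = max(
    abs((prices[i] - prices[j]) - (fees[i] - fees[j]))
    for i in fees
    for j in fees
)
print(prices, max_gap)
```

The check is independent of the hub price, which only shifts all nodal prices by the same constant.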




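3. Model I

In this model, each firm that produces power anticipates the arbitrage amounts by including the variables $a_{fi}$ and a supply function conjecture with fixed intercept in its profit maximization problem. The constraints that these variables satisfy are basically (4), where $p_i$ is determined by the price function:

$$ p_{fi} = P_i^0 - \frac{P_i^0}{Q_i^0} \Big( \sum_{t \in \mathcal{F}} s_{ti} + a_{fi} \Big). $$

(Note the addition of the subscript $f$ in $p_{fi}$, as this is now the price at $i$ anticipated by $f$.) The supply function conjecture is expressed by the equation

$$ s_{-fi} \equiv \sum_{t \ne f} s_{ti} = \frac{p_{fi} - \alpha_i}{\pi_i - \alpha_i}\; s^*_{-fi}. $$

Note that $\pi_i$ is a base price at which $s_{-fi} = s^*_{-fi}$ and is exogenous to the firms. Substituting $s_{-fi}$ into the former equation and simplifying, we obtain

$$ p_{fi} = \frac{ Q_i^0 - s_{fi} - a_{fi} + \dfrac{\alpha_i\, s^*_{-fi}}{\pi_i - \alpha_i} }{ \dfrac{Q_i^0}{P_i^0} + \dfrac{s^*_{-fi}}{\pi_i - \alpha_i} }. $$

Letting $p_f$ be the firm's anticipated price at the hub, firm $f$'s problem is: with $s^*_{-fi}$, $\pi_i$, and $w_i$, $i \in \mathcal{N}$, fixed, find $s_{fi}$, $g_{fi}$, $a_{fi}$, $p_{fi}$ for $i \in \mathcal{N}$, and $p_f$ in order to

$$
\begin{array}{ll}
\text{maximize} & \displaystyle\sum_{i \in \mathcal{N}} ( p_{fi} - w_i )\, s_{fi} - \sum_{i \in \mathcal{N}} ( c_{fi} - w_i )\, g_{fi} \\[3mm]
\text{subject to} & g_{fi} \le \mathrm{CAP}_{fi}, \quad \forall i \in \mathcal{N}, \\[2mm]
& \displaystyle\sum_{i \in \mathcal{N}} ( s_{fi} - g_{fi} ) = 0, \\[3mm]
& p_{fi} = \dfrac{ Q_i^0 - s_{fi} - a_{fi} + \dfrac{\alpha_i\, s^*_{-fi}}{\pi_i - \alpha_i} }{ \dfrac{Q_i^0}{P_i^0} + \dfrac{s^*_{-fi}}{\pi_i - \alpha_i} }, \quad \forall i \in \mathcal{N}, \\[6mm]
& p_{fi} = p_f + w_i, \quad \forall i \in \mathcal{N}, \\[2mm]
& \displaystyle\sum_{i \in \mathcal{N}} a_{fi} = 0, \\[3mm]
& s_{fi},\; g_{fi} \ge 0, \quad \forall i \in \mathcal{N}.
\end{array}
$$

The three equations

$$
\begin{array}{l}
p_{fi} = \dfrac{ Q_i^0 - s_{fi} - a_{fi} + \dfrac{\alpha_i\, s^*_{-fi}}{\pi_i - \alpha_i} }{ \dfrac{Q_i^0}{P_i^0} + \dfrac{s^*_{-fi}}{\pi_i - \alpha_i} }, \quad \forall i \in \mathcal{N}, \\[6mm]
p_{fi} = p_f + w_i, \quad \forall i \in \mathcal{N}, \\[2mm]
\displaystyle\sum_{i \in \mathcal{N}} a_{fi} = 0
\end{array}
$$

uniquely determine $p_{fi}$ and $a_{fi}$ for $i \in \mathcal{N}$ and $p_f$ in terms of $s_{tj}$, $w_j$ and $\pi_j$ for all $t \in \mathcal{F}$ and $j \in \mathcal{N}$. For the purpose of restating firm $f$'s maximization problem, it suffices to solve for $p_f$, obtaining

$$ p_f = \frac{ \displaystyle\sum_{i \in \mathcal{N}} \Big( Q_i^0 - s_{fi} + \frac{\alpha_i\, s^*_{-fi}}{\pi_i - \alpha_i} \Big) - \sum_{i \in \mathcal{N}} w_i \Big( \frac{Q_i^0}{P_i^0} + \frac{s^*_{-fi}}{\pi_i - \alpha_i} \Big) }{ \displaystyle\sum_{i \in \mathcal{N}} \Big( \frac{Q_i^0}{P_i^0} + \frac{s^*_{-fi}}{\pi_i - \alpha_i} \Big) }. $$

Let $p_f(s, \pi, w)$ denote the fraction on the right-hand side as a function of the vectors $s$, $\pi$, and $w$. This function depends on the intercepts $\alpha_i$; but since these are parameters of the model, we do not write them as the arguments of $p_f$. The function $p_f$ also depends on $s^*_{-fi}$ and $\pi_i$; at equilibrium, the latter variables will be equated with $s_{-fi}$ and $p_{fi}$, respectively. Observe that $p_f$ is a linear function of $s_f$, with the other arguments fixed.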
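
The elimination of the arbitrage amounts can be checked numerically. The sketch below evaluates the hub price $p_f$ from the closed-form fraction, recovers each $a_{fi}$ from the nodal price equation with $p_{fi} = p_f + w_i$, and confirms that the recovered amounts sum to zero. All numbers (two nodes, intercepts, sales, fees) are hypothetical, and the function names are introduced here only for illustration.

```python
def hub_price(Q0, P0, s_f, s_star, pi, alpha, w):
    """Hub price p_f obtained by eliminating a_fi through sum_i a_fi = 0."""
    slope = [ss / (p - a) for ss, p, a in zip(s_star, pi, alpha)]
    denom = [q / p0 + sl for q, p0, sl in zip(Q0, P0, slope)]
    numer = sum(q - s + a * sl for q, s, a, sl in zip(Q0, s_f, alpha, slope))
    numer -= sum(wi * d for wi, d in zip(w, denom))
    return numer / sum(denom)


def arbitrage_amounts(p_f, Q0, P0, s_f, s_star, pi, alpha, w):
    """a_fi recovered from the nodal price equation with p_fi = p_f + w_i."""
    amounts = []
    for q, p0, s, ss, p, a, wi in zip(Q0, P0, s_f, s_star, pi, alpha, w):
        slope = ss / (p - a)
        amounts.append(q - s + a * slope - (p_f + wi) * (q / p0 + slope))
    return amounts


# Hypothetical two-node data.
Q0, P0 = [500.0, 300.0], [100.0, 80.0]
s_f, s_star = [60.0, 40.0], [150.0, 90.0]
pi, alpha, w = [50.0, 45.0], [-20.0, -15.0], [0.0, 2.5]

p_f = hub_price(Q0, P0, s_f, s_star, pi, alpha, w)
a_f = arbitrage_amounts(p_f, Q0, P0, s_f, s_star, pi, alpha, w)
print(p_f, a_f, sum(a_f))
```

Because $p_f$ enters the numerator only through the term in $s_f$ with a constant denominator, the same code also makes the affine dependence of $p_f$ on $s_f$ easy to inspect by perturbing `s_f`.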

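We can now restate firm $f$'s problem in the simplified form: with $s_{ti}$ ($t \ne f$), $\pi_i$, and $w_i$, $i \in \mathcal{N}$, fixed, find $s_{fi}$ and $g_{fi}$ for $i \in \mathcal{N}$ in order to

$$
\begin{array}{lll}
\text{maximize} & p_f(s, \pi, w) \displaystyle\sum_{i \in \mathcal{N}} s_{fi} - \sum_{i \in \mathcal{N}} ( c_{fi} - w_i )\, g_{fi} & \\[3mm]
\text{subject to} & g_{fi} \le \mathrm{CAP}_{fi}, \quad \forall i \in \mathcal{N}, & (\gamma_{fi}) \\[2mm]
& \displaystyle\sum_{i \in \mathcal{N}} ( s_{fi} - g_{fi} ) = 0, & (\varphi_f) \\[3mm]
& s_{fi},\; g_{fi} \ge 0, \quad \forall i \in \mathcal{N}. &
\end{array}
$$

The above problem is a quadratic concave maximization problem in the variables $s_{fi}$ and $g_{fi}$ for $i \in \mathcal{N}$, parameterized by $s_{ti}$ for $t \ne f$ and $\pi_i$ and $w_i$ for $i \in \mathcal{N}$. We can write the optimality conditions for the problem as follows:

$$
\begin{array}{ll}
0 \le s_{fi} \perp -p_f(s, \pi, w) + \dfrac{ \displaystyle\sum_{j \in \mathcal{N}} s_{fj} }{ \displaystyle\sum_{j \in \mathcal{N}} \Big( \frac{Q_j^0}{P_j^0} + \frac{s^*_{-fj}}{\pi_j - \alpha_j} \Big) } + \varphi_f \ge 0, & i \in \mathcal{N}, \\[8mm]
0 \le g_{fi} \perp c_{fi} - w_i + \gamma_{fi} - \varphi_f \ge 0, & i \in \mathcal{N}, \\[2mm]
0 \le \gamma_{fi} \perp \mathrm{CAP}_{fi} - g_{fi} \ge 0, & i \in \mathcal{N}, \\[2mm]
0 = \displaystyle\sum_{i \in \mathcal{N}} ( s_{fi} - g_{fi} ). &
\end{array}
$$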

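To complete the description of the model, we need to relate the ISO's problem to the firms' problems. This is accomplished via the market clearing condition, which is simply a flow balancing equation:

$$ y_i = \sum_{t \in \mathcal{F}} ( s_{ti} - g_{ti} ) + a_{fi}, \quad \forall (f, i) \in \mathcal{F} \times \mathcal{N}. \qquad (5) $$

In addition, we stipulate that

$$ p_f(s, \pi, w) + w_i = \pi_i, \quad \forall (f, i) \in \mathcal{F} \times \mathcal{N}, \qquad (6) $$

and $s^*_{-fi} = s_{-fi}$ for all $(f, i) \in \mathcal{F} \times \mathcal{N}$. From the definition of $p_f(s, \pi, w)$, the last two conditions yield

$$ S \equiv \sum_{f \in \mathcal{F}} \sum_{i \in \mathcal{N}} s_{fi} = \sum_{i \in \mathcal{N}} Q_i^0 - \sum_{i \in \mathcal{N}} \frac{Q_i^0}{P_i^0}\, \pi_i, $$

which expresses the total sales $S$ of all firms in all markets in terms of the market prices $\pi_i$. Substituting (6) into the last equation, we obtain

$$ p_f(s, \pi, w) = \Big( \sum_{i \in \mathcal{N}} Q_i^0 - \sum_{i \in \mathcal{N}} \frac{Q_i^0}{P_i^0}\, w_i - S \Big) \Big/ \Big( \sum_{i \in \mathcal{N}} \frac{Q_i^0}{P_i^0} \Big), $$

which shows among other things that $p_f(s, \pi, w)$ is the same for all firms and

$$ p_{fi} = p_f(s, \pi, w) + w_i = \pi_i; $$

thus the price $p_{fi}$ is independent of the firms. Substituting the equality $p_{fi} = \pi_i$ into the expression

$$ p_{fi} = \frac{ Q_i^0 - s_{fi} - a_{fi} + \dfrac{\alpha_i\, s^*_{-fi}}{\pi_i - \alpha_i} }{ \dfrac{Q_i^0}{P_i^0} + \dfrac{s^*_{-fi}}{\pi_i - \alpha_i} } \qquad (7) $$

yields

$$ a_{fi} = Q_i^0 - \frac{Q_i^0}{P_i^0}\, \pi_i - \sum_{t \in \mathcal{F}} s_{ti}, $$

which shows that the arbitrage amounts anticipated by the firms depend only on the market $i$. Substituting the expression

$$ \pi_i = \Big( \sum_{j \in \mathcal{N}} Q_j^0 - \sum_{j \in \mathcal{N}} \frac{Q_j^0}{P_j^0}\, w_j - S \Big) \Big/ \Big( \sum_{j \in \mathcal{N}} \frac{Q_j^0}{P_j^0} \Big) + w_i = \sigma + \sum_{j \in \mathcal{N}} ( \delta_{ij} - \rho_j )\, w_j - \omega\, S $$

into the above expression for $a_{fi}$, we obtain

$$ a_{fi} = R_i - \sum_{t \in \mathcal{F}} s_{ti} + \rho_i\, S - \sum_{j \in \mathcal{N}} \xi_{ij}\, w_j, $$

where

$$ \omega \equiv \frac{1}{\displaystyle\sum_{i \in \mathcal{N}} \frac{Q_i^0}{P_i^0}}, \qquad \sigma \equiv \frac{\displaystyle\sum_{i \in \mathcal{N}} Q_i^0}{\displaystyle\sum_{i \in \mathcal{N}} \frac{Q_i^0}{P_i^0}}, \qquad \rho_i \equiv \omega\, \frac{Q_i^0}{P_i^0}, \qquad R_i \equiv Q_i^0 - \frac{Q_i^0}{P_i^0}\, \sigma, \quad i \in \mathcal{N}, $$

and with $\delta_{ij}$ denoting the Kronecker delta; i.e., $\delta_{ij}$ is equal to one if $i = j$ and equal to zero if $i \ne j$,

$$ \xi_{ij} \equiv \frac{Q_i^0}{P_i^0}\, ( \delta_{ij} - \rho_j ) = \begin{cases} \dfrac{Q_i^0}{P_i^0}\, ( 1 - \rho_i ) & \text{if } i = j, \\[4mm] -\omega\, \dfrac{Q_i^0\, Q_j^0}{P_i^0\, P_j^0} & \text{if } i \ne j. \end{cases} $$

Substituting the above expression for $a_{fi}$ into (5), we obtain

$$ y_i = R_i - \sum_{t \in \mathcal{F}} g_{ti} + \rho_i\, S - \sum_{j \in \mathcal{N}} \xi_{ij}\, w_j. \qquad (8) $$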
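
The constants $\omega$, $\sigma$, $\rho_i$, $R_i$ and the matrix with entries $\xi_{ij}$ can be tabulated directly from the demand intercepts. The following sketch, with hypothetical three-node intercepts, builds them and checks three properties that follow from the definitions: the $\xi$ matrix is symmetric, its rows sum to zero (because the $\rho_j$ sum to one), and its product with the vector of price intercepts $P^0$ reproduces $R$. It also evaluates the quadratic form at a sample vector as a spot check of positive semidefiniteness; this is an illustration, not a proof.

```python
# Hypothetical quantity and price intercepts Q_i^0 and P_i^0 for three nodes.
Q0 = [500.0, 300.0, 200.0]
P0 = [100.0, 80.0, 120.0]
n = len(Q0)

ratio = [q / p for q, p in zip(Q0, P0)]          # Q_i^0 / P_i^0
omega = 1.0 / sum(ratio)
sigma = sum(Q0) * omega
rho = [omega * r for r in ratio]
R = [Q0[i] - ratio[i] * sigma for i in range(n)]

# xi_ij = (Q_i^0 / P_i^0) * (delta_ij - rho_j)
xi = [[ratio[i] * ((1.0 if i == j else 0.0) - rho[j]) for j in range(n)]
      for i in range(n)]

sym_gap = max(abs(xi[i][j] - xi[j][i]) for i in range(n) for j in range(n))
row_gap = max(abs(sum(row)) for row in xi)
r_gap = max(abs(sum(xi[i][j] * P0[j] for j in range(n)) - R[i]) for i in range(n))


def quad_form(v):
    """Value of v^T Xi v."""
    return sum(v[i] * xi[i][j] * v[j] for i in range(n) for j in range(n))


print(sym_gap, row_gap, r_gap, quad_form([1.0, -2.0, 0.5]))
```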

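Proceeding as in [11], we can show that the resulting Model I is to compute

$$ \{ \lambda_k^\pm : k \in \mathcal{A} \}, \quad \{ s_{fi}, g_{fi}, \gamma_{fi} : i \in \mathcal{N},\ f \in \mathcal{F} \}, \quad \text{and} \quad \{ \varphi_f : f \in \mathcal{F} \} $$

such that

$$
\begin{array}{l}
0 \le \lambda_k^- \perp q_k^- + \displaystyle\sum_{\ell \in \mathcal{A}} \sum_{i,j \in \mathcal{N}} \mathrm{PDF}_{ik}\, \xi_{ij}\, \mathrm{PDF}_{j\ell}\, ( \lambda_\ell^- - \lambda_\ell^+ ) + \Big( \sum_{i \in \mathcal{N}} \rho_i\, \mathrm{PDF}_{ik} \Big) S - \sum_{i \in \mathcal{N}} \sum_{f \in \mathcal{F}} \mathrm{PDF}_{ik}\, g_{fi} \ge 0, \quad \forall k \in \mathcal{A}, \\[5mm]
0 \le \lambda_k^+ \perp q_k^+ + \displaystyle\sum_{\ell \in \mathcal{A}} \sum_{i,j \in \mathcal{N}} \mathrm{PDF}_{ik}\, \xi_{ij}\, \mathrm{PDF}_{j\ell}\, ( \lambda_\ell^+ - \lambda_\ell^- ) - \Big( \sum_{i \in \mathcal{N}} \rho_i\, \mathrm{PDF}_{ik} \Big) S + \sum_{i \in \mathcal{N}} \sum_{f \in \mathcal{F}} \mathrm{PDF}_{ik}\, g_{fi} \ge 0, \quad \forall k \in \mathcal{A}, \\[5mm]
0 \le s_{fi} \perp -\sigma + \omega\, S + \dfrac{ \displaystyle\sum_{j \in \mathcal{N}} s_{fj} }{ \displaystyle\sum_{j \in \mathcal{N}} \Big( \frac{Q_j^0}{P_j^0} + \frac{s_{-fj}}{\pi_j - \alpha_j} \Big) } + \varphi_f + \displaystyle\sum_{k \in \mathcal{A}} \sum_{j \in \mathcal{N}} \rho_j\, \mathrm{PDF}_{jk}\, ( \lambda_k^+ - \lambda_k^- ) \ge 0, \quad \forall (f, i) \in \mathcal{F} \times \mathcal{N}, \\[8mm]
0 \le g_{fi} \perp c_{fi} + \displaystyle\sum_{k \in \mathcal{A}} \mathrm{PDF}_{ik}\, ( \lambda_k^- - \lambda_k^+ ) + \gamma_{fi} - \varphi_f \ge 0, \quad \forall (f, i) \in \mathcal{F} \times \mathcal{N}, \\[4mm]
0 \le \gamma_{fi} \perp \mathrm{CAP}_{fi} - g_{fi} \ge 0, \quad \forall (f, i) \in \mathcal{F} \times \mathcal{N}, \\[2mm]
0 = \displaystyle\sum_{i \in \mathcal{N}} ( s_{fi} - g_{fi} ), \quad \forall f \in \mathcal{F},
\end{array}
$$

where

$$ q_k^\pm \equiv T_k^\pm \mp \sum_{i \in \mathcal{N}} \mathrm{PDF}_{ik}\, R_i, \quad \forall k \in \mathcal{A}. $$

In vector form, we have

$$ q^\pm = T^\pm \mp \Pi^T\, \Xi\, P^0. $$

We observe that

$$
\begin{array}{rcl}
\pi_i & = & \sigma - \omega\, S + \displaystyle\sum_{j \in \mathcal{N}} \sum_{k \in \mathcal{A}} ( \delta_{ij} - \rho_j )\, \mathrm{PDF}_{jk}\, ( \lambda_k^+ - \lambda_k^- ) \\[4mm]
& = & \sigma - \omega\, S + \dfrac{P_i^0}{Q_i^0} \displaystyle\sum_{j \in \mathcal{N}} \sum_{k \in \mathcal{A}} \xi_{ij}\, \mathrm{PDF}_{jk}\, ( \lambda_k^+ - \lambda_k^- ),
\end{array}
\qquad (9)
$$

which expresses the regional prices $\pi_i$ in terms of the total sales $S$ and the dual variables of the transmission capacity constraints. Subsequently, we show that $\pi_i$ is uniquely determined by the total sales only.

Let $h_{\rm I}$, with values in $\Re^{|\mathcal{N}| \times |\mathcal{F}|}$, be defined by

$$ h_{{\rm I},f}(s, \lambda^\pm) \equiv \frac{ \displaystyle\sum_{j \in \mathcal{N}} s_{fj} }{ \displaystyle\sum_{j \in \mathcal{N}} \Big( \frac{Q_j^0}{P_j^0} + \frac{s_{-fj}}{\pi_j - \alpha_j} \Big) }\; \mathbf{1}_{|\mathcal{N}|}, \quad \forall f \in \mathcal{F}, $$

where $\pi_i$ is given by (9). We can now write Model I in vector-matrix form. First we assemble the variables of the system in two vectors: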



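$$ x \equiv \begin{pmatrix} \lambda^\pm \\ s \\ g \\ \gamma \end{pmatrix} \in \Re^{( 2|\mathcal{A}| + 3|\mathcal{N}| \times |\mathcal{F}| )} \quad \text{and} \quad \varphi \equiv ( \varphi_f )_{f \in \mathcal{F}} \in \Re^{|\mathcal{F}|}. $$

(For notational simplification, we drop the ' in the variable $\varphi$.) Next, we define the $|\mathcal{A}| \times |\mathcal{A}|$ symmetric positive semidefinite matrix

$$ \Lambda \equiv \Pi^T\, \Xi\, \Pi; $$

also define two matrices in partitioned form:

$$ M_{\rm oli} \equiv \begin{bmatrix} M_\lambda & M_{\lambda s} & M_{\lambda g} & 0 \\ -( M_{\lambda s} )^T & M_s & 0 & 0 \\ -( M_{\lambda g} )^T & 0 & 0 & I \\ 0 & 0 & -I & 0 \end{bmatrix}, \qquad N \equiv \begin{bmatrix} 0 \\ J \\ -J \\ 0 \end{bmatrix}, $$

where

$$ M_\lambda \equiv \begin{bmatrix} \Lambda & -\Lambda \\ -\Lambda & \Lambda \end{bmatrix}, $$

$$ M_{\lambda s} \equiv \begin{bmatrix} \Pi^T \rho\, \mathbf{1}^T_{|\mathcal{N}|} & \cdots & \Pi^T \rho\, \mathbf{1}^T_{|\mathcal{N}|} \\ -\Pi^T \rho\, \mathbf{1}^T_{|\mathcal{N}|} & \cdots & -\Pi^T \rho\, \mathbf{1}^T_{|\mathcal{N}|} \end{bmatrix} \in \Re^{2|\mathcal{A}| \times ( |\mathcal{N}| \times |\mathcal{F}| )}, $$

$$ M_{\lambda g} \equiv \begin{bmatrix} -\Pi^T & \cdots & -\Pi^T \\ \Pi^T & \cdots & \Pi^T \end{bmatrix} \in \Re^{2|\mathcal{A}| \times ( |\mathcal{N}| \times |\mathcal{F}| )}, $$

$$ M_s \equiv \omega \begin{bmatrix} E & E & \cdots & E \\ \vdots & \vdots & & \vdots \\ E & E & \cdots & E \end{bmatrix} \in \Re^{( |\mathcal{N}| \times |\mathcal{F}| ) \times ( |\mathcal{N}| \times |\mathcal{F}| )} $$

(with each $E$ being the $|\mathcal{N}| \times |\mathcal{N}|$ matrix of all ones), and

$$ J \equiv \begin{bmatrix} \mathbf{1}_{|\mathcal{N}|} & 0 & \cdots & 0 \\ 0 & \mathbf{1}_{|\mathcal{N}|} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \mathbf{1}_{|\mathcal{N}|} \end{bmatrix} \in \Re^{( |\mathcal{N}| \times |\mathcal{F}| ) \times |\mathcal{F}|}. $$

The matrix $M_{\rm oli}$ is square and of order $( 2|\mathcal{A}| + 3|\mathcal{N}| \times |\mathcal{F}| )$; the matrix $N$ is rectangular and of order $( 2|\mathcal{A}| + 3|\mathcal{N}| \times |\mathcal{F}| )$ by $|\mathcal{F}|$. Define a constant vector partitioned in accordance with $M_{\rm oli}$:

$$ q_{\rm oli} \equiv \begin{pmatrix} q^\pm \\ -\sigma\, \mathbf{1}_{|\mathcal{N}| \times |\mathcal{F}|} \\ c \\ \mathrm{CAP} \end{pmatrix} \in \Re^{( 2|\mathcal{A}| + 3|\mathcal{N}| \times |\mathcal{F}| )}. $$

With the above vectors and matrices, Model I can now be stated as the following mixed NCP:

$$
\begin{array}{l}
0 \le x \perp q_{\rm oli} + M_{\rm oli}\, x + N \varphi + \begin{pmatrix} 0 \\ 0 \\ h_{\rm I}(s, \lambda^\pm) \\ 0 \\ 0 \end{pmatrix} \ge 0, \\[10mm]
0 = N^T x.
\end{array}
\qquad (10)
$$

If not for the nonlinear function $h_{\rm I}(s, \lambda^\pm)$, Model I would be a linear complementarity problem (LCP), which is exactly the one treated in [11]. The existence of a solution to (10) relies on bounding the components $h_{{\rm I},f}(s, \lambda^\pm)$ for all $f \in \mathcal{F}$. In turn, this relies on bounding the prices $\pi_i$ for $i \in \mathcal{N}$. In Section 6, we show how to obtain the necessary bounds via LCP theory.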

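4. Model II

In this model, each firm takes the arbitrage amounts as input parameters in its profit maximization problem. Specifically, with the price $p_i$ given by (7), firm $f$'s problem is: with $a_i$, $s^*_{-fi}$, $w_i$, and $\pi_i$ fixed for all $i \in \mathcal{N}$, find $s_{fi}$ and $g_{fi}$ for all $i \in \mathcal{N}$ in order to

$$
\begin{array}{ll}
\text{maximize} & \displaystyle\sum_{i \in \mathcal{N}} \left( \frac{ Q_i^0 - s_{fi} - a_i + \dfrac{\alpha_i\, s^*_{-fi}}{\pi_i - \alpha_i} }{ \dfrac{Q_i^0}{P_i^0} + \dfrac{s^*_{-fi}}{\pi_i - \alpha_i} } - w_i \right) s_{fi} - \sum_{i \in \mathcal{N}} ( c_{fi} - w_i )\, g_{fi} \\[10mm]
\text{subject to} & \displaystyle\sum_{i \in \mathcal{N}} ( s_{fi} - g_{fi} ) = 0 \\[3mm]
\text{and} & s_{fi} \ge 0, \quad 0 \le g_{fi} \le \mathrm{CAP}_{fi}, \quad \forall (f, i) \in \mathcal{F} \times \mathcal{N}.
\end{array}
$$

Model II is complete with the inclusion of the ISO's problem plus the arbitrage constraint:

$$
\begin{array}{l}
\dfrac{ Q_i^0 - s_{fi} - a_i + \dfrac{\alpha_i\, s^*_{-fi}}{\pi_i - \alpha_i} }{ \dfrac{Q_i^0}{P_i^0} + \dfrac{s^*_{-fi}}{\pi_i - \alpha_i} } - w_i - p_H = 0, \quad \forall i \in \mathcal{N}, \\[10mm]
\displaystyle\sum_{i \in \mathcal{N}} a_i = 0,
\end{array}
$$

the flow balancing equation (5):

$$ y_i = \sum_{t \in \mathcal{F}} ( s_{ti} - g_{ti} ) + a_{fi}, \quad \forall (f, i) \in \mathcal{F} \times \mathcal{N}, $$

and the price equation

$$ \pi_i = w_i + p_f, \quad \forall i \in \mathcal{N}. $$

Following a similar derivation as before, we can show that Model II can be formulated as the following NCP:

$$
\begin{array}{l}
0 \le x \perp q_{\rm oli} + M_{\rm oli}\, x + N \varphi + \begin{pmatrix} 0 \\ 0 \\ h_{\rm II}(s, \lambda^\pm) \\ 0 \\ 0 \end{pmatrix} \ge 0, \\[10mm]
0 = N^T x,
\end{array}
\qquad (12)
$$

where, with $\pi_i$ given by (9), the function $h_{\rm II}$ with values in $\Re^{|\mathcal{F}| \times |\mathcal{N}|}$ is given by

$$ h_{{\rm II},fi}(s, \lambda^\pm) \equiv \frac{ s_{fi} }{ \dfrac{Q_i^0}{P_i^0} + \dfrac{s_{-fi}}{\pi_i - \alpha_i} }. $$

The two NCPs (10) and (12) differ in the two closely related nonlinear functions $h_{\rm I}$ and $h_{\rm II}$.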

5. Complementarity Theory 

The key to the analysis of Models I and II is the theory of monotone 
LCPs. This theory in turn yields an existence result of a special vari- 
ational inequality that is the cornerstone for the existence of solutions 
to the supply-function based market models. In this section, we present 
the prerequisite LCP theory and its implications. 

We begin by recalling that the LCP range of a matrix $M \in \Re^{n \times n}$, denoted $\mathcal{R}(M)$, is the set of all vectors $q \in \Re^n$ for which the LCP $(q, M)$ has a solution. Our first result pertains to the solutions of an LCP defined by a symmetric positive semidefinite matrix. Although part (a) of this result is known and parts (b) and (c) hold in more general contexts (see [6]), we give a full treatment of the following theorem because it is the basis for the entire subsequent development.

Theorem 1 Let $M \in \Re^{n \times n}$ be a symmetric positive semidefinite matrix.

(a) For every $q \in \mathcal{R}(M)$, the solutions of the LCP $(q, M)$ are $w$-unique; that is, if $z^1$ and $z^2$ are any two solutions of the LCP $(q, M)$, then $M z^1 = M z^2$. Let $w(q)$ denote the common vector $q + M z$ for any solution $z$ of the LCP $(q, M)$.

(b) There exists a constant $c > 0$ such that

$$ \| w(q) \| \le c\, \| q \|, \quad \forall q \in \mathcal{R}(M). $$

(c) The function $w : \mathcal{R}(M) \to \Re^n$ is continuous.

Proof. Statement (a) is a well-known result in LCP theory. We next prove (b) by contradiction. Suppose no constant $c$ satisfying (b) exists. Then there exists a sequence of vectors $\{ q^k \} \subset \mathcal{R}(M)$ satisfying

$$ \| w(q^k) \| > k\, \| q^k \| $$

for every $k$. We have $w(q^k) \ne 0$ for every $k$ and

$$ \lim_{k \to \infty} \frac{q^k}{\| w(q^k) \|} = 0. $$

Without loss of generality, we may assume that

$$ \lim_{k \to \infty} \frac{w(q^k)}{\| w(q^k) \|} = v^\infty $$

for some vector $v^\infty$, which must be nonzero. We may further assume that

$$ \mathrm{supp}( w(q^k) ) \equiv \{ i : w_i(q^k) > 0 \} $$

is the same for all $k$, which we denote $\alpha$. With $\bar\alpha$ denoting the complement of $\alpha$ in $\{ 1, \ldots, n \}$, we have, for every $k$,

$$
\begin{array}{l}
0 = q^k_{\bar\alpha} + M_{\bar\alpha \bar\alpha}\, z^k_{\bar\alpha}, \\[2mm]
w_\alpha(q^k) = q^k_\alpha + M_{\alpha \bar\alpha}\, z^k_{\bar\alpha} > 0
\end{array}
$$

for some vector $z^k_{\bar\alpha} \ge 0$. Dividing by $\| w(q^k) \|$, we deduce the existence of a nonnegative vector $z^\infty_{\bar\alpha}$, which is not necessarily related to the sequence $\{ z^k_{\bar\alpha} \}$, such that

$$
\begin{array}{l}
0 = M_{\bar\alpha \bar\alpha}\, z^\infty_{\bar\alpha}, \\[2mm]
v^\infty_\alpha = M_{\alpha \bar\alpha}\, z^\infty_{\bar\alpha}.
\end{array}
$$

Since $M$ is symmetric positive semidefinite, the above implies that $v^\infty_\alpha$, and thus $v^\infty$, is equal to zero. This contradiction establishes part (b).

To prove part (c), let $\{ q^k \} \subset \mathcal{R}(M)$ converge to a limit vector $q^\infty$, which must necessarily belong to $\mathcal{R}(M)$ because the LCP range is a closed cone. For each $k$, let $z^k \in \mathrm{SOL}(q^k, M)$ be such that $w(q^k) = q^k + M z^k$. The sequence $\{ w(q^k) \}$ is bounded; moreover, if $w^\infty$ is any accumulation point of this $w$-sequence, then using the complementary cone argument, as done in the proof of part (b), we deduce the existence of a solution $z^\infty \in \mathrm{SOL}(q^\infty, M)$ such that $w^\infty = q^\infty + M z^\infty$. This is enough to show by part (a) that the sequence $\{ w(q^k) \}$ has a unique accumulation point, which is equal to $w(q^\infty)$. Therefore the continuity of the map $w(q)$ at every vector $q \in \mathcal{R}(M)$ follows. Q.E.D.
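
The $w$-uniqueness in part (a) is easy to see on a tiny example. The sketch below takes the singular symmetric positive semidefinite matrix with all entries equal to one and $q = (-1, -1)$; every $z = (t, 1 - t)$ with $t \in [0, 1]$ solves the LCP, yet $q + Mz$ is the zero vector for all of them. The matrix, the vector $q$, and the helper names are illustrative choices, not taken from the text.

```python
def lcp_w(q, M, z):
    """Return w = q + M z."""
    return [qi + sum(mij * zj for mij, zj in zip(row, z)) for qi, row in zip(q, M)]


def solves_lcp(q, M, z, tol=1e-12):
    """Check z >= 0, w = q + M z >= 0 and z^T w = 0 up to a tolerance."""
    w = lcp_w(q, M, z)
    nonneg = all(zi >= -tol for zi in z) and all(wi >= -tol for wi in w)
    complementary = abs(sum(zi * wi for zi, wi in zip(z, w))) <= tol
    return nonneg and complementary


M = [[1.0, 1.0], [1.0, 1.0]]   # symmetric positive semidefinite, singular
q = [-1.0, -1.0]

# A one-parameter family of solutions z = (t, 1 - t).
candidates = [[t, 1.0 - t] for t in (0.0, 0.25, 0.5, 1.0)]
all_solve = all(solves_lcp(q, M, z) for z in candidates)
w_values = [lcp_w(q, M, z) for z in candidates]
print(all_solve, w_values)
```

The solution set is a whole segment, so $z$ is far from unique, while the vector $w$ is the same at every solution.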

Our goal is to apply the above theorem to the matrix

$$ \begin{bmatrix} \Lambda & -\Lambda \\ -\Lambda & \Lambda \end{bmatrix} = \begin{bmatrix} \Pi^T \\ -\Pi^T \end{bmatrix} \Xi \begin{bmatrix} \Pi & -\Pi \end{bmatrix}. $$

For this purpose, we derive a corollary of Theorem 1 pertaining to a symmetric positive semidefinite matrix of the above form.

Corollary 1 Let $M = A^T E A$, where $E$ is a symmetric positive semidefinite $m \times m$ matrix and $A$ is an arbitrary $m \times n$ matrix.

(a) For every $q \in \mathcal{R}(M)$, if $z^1$ and $z^2$ are any two solutions of the LCP $(q, M)$, then $E A z^1 = E A z^2$. Let $\widetilde{w}(q)$ denote the common vector $E A z$ for any solution $z$ of the LCP $(q, M)$.

(b) There exists a constant $c' > 0$ such that

$$ \| \widetilde{w}(q) \| \le c'\, \| q \|, \quad \forall q \in \mathcal{R}(M). $$

(c) The function $\widetilde{w} : \mathcal{R}(M) \to \Re^m$ is continuous.

Proof. We note that for any nonzero symmetric positive semidefinite $M$, we have

$$ \frac{1}{\lambda_{\max}(M)}\, \| M z \|^2 \le z^T M z \le \frac{1}{\lambda^+_{\min}(M)}\, \| M z \|^2, $$

where $\lambda^+_{\min}(M)$ is the smallest positive eigenvalue of $M$ and $\lambda_{\max}(M)$ is the largest eigenvalue of $M$. With $M = A^T E A$, it follows that

$$ M z = 0 \iff E A z = 0. $$

Hence, for every $q \in \mathcal{R}(M)$, $E A z$ is a constant for all solutions of the LCP $(q, M)$. Moreover, there exists a scalar $c > 0$ such that for every $q \in \mathcal{R}(M)$,

$$ \| M z \| \le ( 1 + c )\, \| q \|, $$

for every solution $z$ of the LCP $(q, M)$. Since

$$ \frac{1}{\lambda^+_{\min}(M)}\, \| M z \|^2 \ge z^T M z = ( A z )^T E A z \ge \frac{1}{\lambda_{\max}(E)}\, \| E A z \|^2 $$

for all $z \in \Re^n$, part (b) of the corollary follows readily. The proof of part (c) is very similar to that of the same part in Theorem 1. Q.E.D.

It can be shown, using the theory of piecewise affine functions, that both functions $w(q)$ and $\widetilde{w}(q)$ are Lipschitz continuous on $\mathcal{R}(M)$. Since this Lipschitz continuity property is not needed in the subsequent analysis, we omit the details.

5.1 An existence result for a special VI 

In what follows, we establish an existence result for a linearly constrained variational inequality (VI) of a special kind. This result will subsequently be applied to Models I and II of power market equilibria. The setup of the result is a VI $(K, F)$, where $K$ is the Cartesian product of two polyhedra $K_1 \subseteq \Re^{n_1}$ and $K_2 \subseteq \Re^{n_2}$ with $K_2$ being compact. The mapping $F$ is of the form: for $(x, y) \in \Re^{n_1 + n_2}$,

$$ F(x, y) \equiv \begin{pmatrix} q \\ r + h(y) \end{pmatrix} + \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix} \begin{pmatrix} x \\ y \end{pmatrix}, $$

where $h : \Re^{n_2} \to \Re^{n_2}$ is a continuous function and the matrix

$$ M \equiv \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix} \qquad (13) $$

is positive semidefinite (not necessarily symmetric). In the following result, an AVI is a VI defined by an affine pair $(K, F)$, i.e., $K$ is a polyhedron and $F$ is an affine map. (We refer the reader to the monograph [6] for a comprehensive treatment of the finite-dimensional variational inequalities and complementarity problems.)

Proposition 1 In addition to the above setting, assume that for all $\bar{y} \in K_2$, the AVI $(K, F^{\bar{y}})$ has a solution, where

$$ F^{\bar{y}}(x, y) \equiv \begin{pmatrix} q \\ r + h(\bar{y}) \end{pmatrix} + \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix} \begin{pmatrix} x \\ y \end{pmatrix}, \quad (x, y) \in \Re^{n_1 + n_2}. $$

Then the VI $(K, F)$ has a solution.

Proof. We apply Kakutani's fixed-point theorem to the set-valued mapping $\Gamma : K_2 \to K_2$ defined as follows. For each $\bar{y} \in K_2$, $\Gamma(\bar{y})$ consists of all vectors $y \in K_2$ for which there exists a vector $x \in K_1$ such that the pair $(x, y)$ solves the VI $(K, F^{\bar{y}})$. Clearly, $\Gamma(\bar{y})$ is a nonempty subset of $K_2$; $\Gamma(\bar{y})$ is convex because if $y^1$ and $y^2$ are any two elements in $\Gamma(\bar{y})$ and $x^1$ and $x^2$ are such that $(x^i, y^i) \in \mathrm{SOL}(K, F^{\bar{y}})$ for $i = 1, 2$, then $\tau (x^1, y^1) + (1 - \tau)(x^2, y^2)$ remains a solution of the VI $(K, F^{\bar{y}})$ for all scalars $\tau \in (0, 1)$, by the positive semidefiniteness of the matrix $M$. We next verify that $\Gamma$ is a closed map. For this purpose, let $\{ \bar{y}^k \}$ be a sequence of vectors in $K_2$ converging to a vector $\bar{y}^\infty$ in $K_2$ and for each $k$ let $(x^k, y^k)$ be a solution of the VI $(K, F^{\bar{y}^k})$ such that the sequence $\{ y^k \}$ converges to a vector $y^\infty$. We need to show the existence of a vector $x^\infty$ such that the pair $(x^\infty, y^\infty)$ solves the VI $(K, F^{\bar{y}^\infty})$. Write

$$ K_1 \equiv \{ x \in \Re^{n_1} : A x \le b \} $$

and

$$ K_2 \equiv \{ y \in \Re^{n_2} : C y \le d \}. $$

For each $k$, there exist multipliers $\mu^k$ and $\eta^k$ such that

$$
\begin{array}{l}
\begin{pmatrix} q \\ r + h(\bar{y}^k) \end{pmatrix} + \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix} \begin{pmatrix} x^k \\ y^k \end{pmatrix} + \begin{pmatrix} A^T \mu^k \\ C^T \eta^k \end{pmatrix} = 0, \\[5mm]
0 \le \mu^k \perp A x^k - b \le 0, \\[2mm]
0 \le \eta^k \perp C y^k - d \le 0.
\end{array}
$$

Again by a standard complementary cone argument, we can deduce the existence of $\mu^\infty$, $\eta^\infty$, and $x^\infty$, which are not necessarily the limits of the sequences $\{ \mu^k \}$, $\{ \eta^k \}$, and $\{ x^k \}$, respectively, such that

$$
\begin{array}{l}
\begin{pmatrix} q \\ r + h(\bar{y}^\infty) \end{pmatrix} + \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix} \begin{pmatrix} x^\infty \\ y^\infty \end{pmatrix} + \begin{pmatrix} A^T \mu^\infty \\ C^T \eta^\infty \end{pmatrix} = 0, \\[5mm]
0 \le \mu^\infty \perp A x^\infty - b \le 0, \\[2mm]
0 \le \eta^\infty \perp C y^\infty - d \le 0.
\end{array}
$$

This establishes that $\Gamma$ is a closed map. In particular, it follows that $\Gamma(\bar{y})$ is a closed subset of $K_2$ for all $\bar{y}$ in $K_2$. Thus $\Gamma$ satisfies all the assumptions required for the applicability of Kakutani's fixed-point theorem. By this theorem, $\Gamma$ has a fixed point, which can easily be seen to be a solution of the VI $(K, F)$. Q.E.D.

6. Properties of Models I and II 

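Returning to the mixed NCPs (10) and (12), we consider the following LCP in the variables $\lambda^\pm$, parameterized by $S$ and $g$:

$$
\begin{array}{l}
0 \le \lambda^- \perp q^- + \Lambda ( \lambda^- - \lambda^+ ) + \Pi^T \Big( \rho\, S - \displaystyle\sum_{f \in \mathcal{F}} g_f \Big) \ge 0, \\[5mm]
0 \le \lambda^+ \perp q^+ + \Lambda ( \lambda^+ - \lambda^- ) - \Pi^T \Big( \rho\, S - \displaystyle\sum_{f \in \mathcal{F}} g_f \Big) \ge 0.
\end{array}
\qquad (14)
$$

We want to derive a sufficient condition under which the above LCP will have a solution for all "feasible" sales and generations. Specifically, let

$$ Y \equiv \prod_{f \in \mathcal{F}} \Big\{ ( s_f, g_f ) \in \Re_+^{2|\mathcal{N}|} : \sum_{i \in \mathcal{N}} ( s_{fi} - g_{fi} ) = 0, \ g_{fi} \le \mathrm{CAP}_{fi}, \ \forall (f, i) \in \mathcal{F} \times \mathcal{N} \Big\} $$

be the set of such sales and generations. The set $Y$ is a compact polyhedron in $\Re^{2|\mathcal{N}| \times |\mathcal{F}|}$. We have for all pairs $(s, g) \in Y$ and every $j \in \mathcal{N}$,

$$ \rho_j\, S - \sum_{f \in \mathcal{F}} g_{fj} = -\sum_{f \in \mathcal{F}} \sum_{i \in \mathcal{N}} ( \delta_{ij} - \rho_j )\, g_{fi}. $$

Thus

$$
\begin{array}{rcl}
\displaystyle\sum_{j \in \mathcal{N}} \mathrm{PDF}_{jk} \Big( \rho_j\, S - \sum_{f \in \mathcal{F}} g_{fj} \Big) & = & -\displaystyle\sum_{f \in \mathcal{F}} \sum_{i \in \mathcal{N}} \sum_{j \in \mathcal{N}} \mathrm{PDF}_{jk}\, ( \delta_{ij} - \rho_j )\, g_{fi} \\[5mm]
& = & -\displaystyle\sum_{f \in \mathcal{F}} \sum_{i \in \mathcal{N}} \Big( \sum_{j \in \mathcal{N}} \mathrm{PDF}_{jk}\, \xi_{ji} \Big) \frac{P_i^0}{Q_i^0}\, g_{fi}.
\end{array}
$$

Therefore, the LCP (14) can be written as:

$$ 0 \le \begin{pmatrix} \lambda^- \\ \lambda^+ \end{pmatrix} \perp \begin{pmatrix} T^- \\ T^+ \end{pmatrix} + \begin{bmatrix} \Pi^T \\ -\Pi^T \end{bmatrix} \Xi \left( P^0 - D \sum_{f \in \mathcal{F}} g_f + \begin{bmatrix} \Pi & -\Pi \end{bmatrix} \begin{pmatrix} \lambda^- \\ \lambda^+ \end{pmatrix} \right) \ge 0, \qquad (15) $$

where $D$ is the $|\mathcal{N}| \times |\mathcal{N}|$ diagonal matrix with diagonal entries $P_i^0 / Q_i^0$.

Proposition 2 If there exists a vector $\lambda \in \Re^{|\mathcal{A}|}$ satisfying

$$ -T^- \le \Pi^T\, \Xi\, P^0 + \Pi^T\, \Xi\, \Pi\, \lambda \le T^+, \qquad (16) $$

then the LCP (14) has a solution $\lambda^\pm$ for every pair $(s, g) \in Y$.

Proof. The LCP (15) is of the form:

$$ 0 \le z \perp q + A^T E\, r + A^T E A\, z \ge 0, $$

where $E$ is a symmetric positive semidefinite matrix. It follows from LCP theory that if there exists a vector $z$ satisfying $q + A^T E z \ge 0$, then the LCP $(q + A^T E r, M)$, where $M \equiv A^T E A$, has a solution for all vectors $r$. Q.E.D.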

Throughout the following discussion, we assume that condition (16) holds. Thus the LCP (15) has a solution for all vectors $g$. Moreover, specializing Corollary 1 to the matrix $M_\lambda$, we deduce that for any vector $g$, if $( \lambda^{-,i}, \lambda^{+,i} )$ for $i = 1, 2$ are any two solutions of (15), we have

$$ \Xi\, \Pi ( \lambda^{+,1} - \lambda^{-,1} ) = \Xi\, \Pi ( \lambda^{+,2} - \lambda^{-,2} ). $$

Furthermore, if $\Phi(g)$ denotes this common vector, then $\Phi$ is a Lipschitz continuous function from $\Re^{|\mathcal{N}| \times |\mathcal{F}|}$ into $\Re^{|\mathcal{N}|}$. In terms of this function, we have

$$ \pi_i = \sigma - \omega\, S + \Phi_i(g), \quad \forall i \in \mathcal{N}, \qquad (17) $$

which shows that $\pi_i$ is a function of the total sales $S$ and the generations $g$. Since each $\Phi_i$ is continuous and $Y$ is compact, it follows that for each $i \in \mathcal{N}$, the scalar

$$ \varsigma_i \equiv \min \{ \sigma - \omega\, S + \Phi_i(g) : (s, g) \in Y \} $$

is finite. Therefore, if the intercepts $\alpha_i$ satisfy

$$ \alpha_i < \varsigma_i, \quad \forall i \in \mathcal{N}, $$

then the denominators in $h_{{\rm I},f}(s, \lambda^\pm)$ and $h_{{\rm II},fi}(s, \lambda^\pm)$ are positive for all $(s, g) \in Y$ and all $f \in \mathcal{F}$. Notice that as a result of (17), we can replace the dependence on $\lambda^\pm$ in the two functions $h_{\rm I}$ and $h_{\rm II}$ by the dependence on $g$ instead. The computation of each scalar $\varsigma_i$ requires the solution of a mathematical program with equilibrium constraints [10] that has a linear objective function and a parametric, monotone LCP constraint.



6.1 Existence of solutions 

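Both NCPs (10) and (12) are equivalent to a VI of the type considered in Proposition 1. More specifically, define the following principal submatrix of $M_{\rm oli}$ by removing the last row and column:

$$ \bar{M}_{\rm oli} \equiv \begin{bmatrix} M_\lambda & M_{\lambda s} & M_{\lambda g} \\ -( M_{\lambda s} )^T & M_s & 0 \\ -( M_{\lambda g} )^T & 0 & 0 \end{bmatrix}; $$

the matrix $\bar{M}_{\rm oli}$ is of order $( 2|\mathcal{A}| + 2|\mathcal{N}| \times |\mathcal{F}| )$. Define a reduced vector $\bar{q}_{\rm oli}$ accordingly:

$$ \bar{q}_{\rm oli} \equiv \begin{pmatrix} q^\pm \\ -\sigma\, \mathbf{1}_{|\mathcal{N}| \times |\mathcal{F}|} \\ c \end{pmatrix}. $$

Let $n_1 \equiv 2|\mathcal{A}|$ and $n_2 \equiv 2|\mathcal{N}| \times |\mathcal{F}|$. Identify the vectors $x$ and $y$ with $\lambda^\pm$ and $(s, g)$, respectively, the constant vector $(q, r)$ with $\bar{q}_{\rm oli}$, the matrix $M$ with $\bar{M}_{\rm oli}$ and the function $h(y)$ with either

$$ \begin{pmatrix} h_{\rm I}(s, g) \\ 0 \end{pmatrix} \quad \text{or} \quad \begin{pmatrix} h_{\rm II}(s, g) \\ 0 \end{pmatrix}. $$

Furthermore, let $K_1$ be the nonnegative orthant of $\Re^{n_1}$ and $K_2$ be the set $Y$:

$$ K_2 \equiv \prod_{f \in \mathcal{F}} \Big\{ ( s_f, g_f ) \in \Re_+^{2|\mathcal{N}|} : \sum_{i \in \mathcal{N}} ( s_{fi} - g_{fi} ) = 0, \ g_{fi} \le \mathrm{CAP}_{fi}, \ \forall (f, i) \in \mathcal{F} \times \mathcal{N} \Big\}. $$

Under the above identifications, Models I and II can therefore be formulated as the VI $( K_1 \times K_2, F )$, where $F$ is given by (13). We can readily apply Proposition 1 to establish the following existence result for the two models.

Theorem 2 Suppose that there exists a vector $\lambda \in \Re^{|\mathcal{A}|}$ satisfying (16). If

$$ \alpha_i < \min \{ \sigma - \omega\, S + \Phi_i(g) : (s, g) \in Y \}, \quad \forall i \in \mathcal{N}, \qquad (18) $$

then solutions exist to Models I and II.

Proof. Under the assumption on the intercepts $\alpha_i$, the function $h(s, g)$ is well defined on the set $K_2$. For every $\bar{y} \equiv ( \bar{s}, \bar{g} ) \in K_2$, the VI $(K, F^{\bar{y}})$ is equivalent to the mixed LCP in the variable $(x, \varphi)$:

$$
\begin{array}{l}
0 \le x \perp q_{\rm oli} + M_{\rm oli}\, x + N \varphi + ( 0, 0, h(\bar{s}, \bar{g}), 0, 0 )^T \ge 0, \\[2mm]
0 = N^T x.
\end{array}
$$

This mixed LCP is clearly feasible and monotone. It therefore has a solution. The existence of solutions to Models I and II follows readily from Proposition 1. Q.E.D.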

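The next result identifies two important properties of the solutions to Model II. It shows in particular that Model I can be solved by solving Model II.

Theorem 3 Under the assumptions of Theorem 2, if $( \lambda^\pm, s, g, \varphi )$ is a solution to Model II, then $\forall\, i \ne j$ and $\forall f \in \mathcal{F}$,

$$ \frac{ s_{fi} }{ \dfrac{Q_i^0}{P_i^0} + \dfrac{s_{-fi}}{\pi_i - \alpha_i} } = \frac{ s_{fj} }{ \dfrac{Q_j^0}{P_j^0} + \dfrac{s_{-fj}}{\pi_j - \alpha_j} }. $$

Therefore every solution to Model II is a solution to Model I.

Proof. It suffices to show that for all $i \ne j$ and all $f \in \mathcal{F}$,

$$ \frac{ s_{fj} }{ \dfrac{Q_j^0}{P_j^0} + \dfrac{s_{-fj}}{\pi_j - \alpha_j} } \le \frac{ s_{fi} }{ \dfrac{Q_i^0}{P_i^0} + \dfrac{s_{-fi}}{\pi_i - \alpha_i} }. $$

This is because by reversing the role of $i$ and $j$, we obtain the reverse inequality and equality therefore must hold. The above inequality is clearly valid if $s_{fj} = 0$. So we may assume that $s_{fj} > 0$. By complementarity, we have

$$ \frac{ s_{fj} }{ \dfrac{Q_j^0}{P_j^0} + \dfrac{s_{-fj}}{\pi_j - \alpha_j} } = \sigma - \omega\, S - \varphi_f - \sum_{k \in \mathcal{A}} \Big( \sum_{j' \in \mathcal{N}} \rho_{j'}\, \mathrm{PDF}_{j'k} \Big) ( \lambda_k^+ - \lambda_k^- ) \le \frac{ s_{fi} }{ \dfrac{Q_i^0}{P_i^0} + \dfrac{s_{-fi}}{\pi_i - \alpha_i} }. $$

This establishes the first assertion of the theorem. To prove the second assertion, we note that by what has just been proved, it follows that if $( \lambda^\pm, s, g, \varphi )$ is a solution to Model II, then we must have

$$ \frac{ s_{fi} }{ \dfrac{Q_i^0}{P_i^0} + \dfrac{s_{-fi}}{\pi_i - \alpha_i} } = \frac{ \displaystyle\sum_{j \in \mathcal{N}} s_{fj} }{ \displaystyle\sum_{j \in \mathcal{N}} \Big( \frac{Q_j^0}{P_j^0} + \frac{s_{-fj}}{\pi_j - \alpha_j} \Big) }. $$

This shows that $( \lambda^\pm, s, g, \varphi )$ is also a solution to Model I. Q.E.D.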
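
The last step of the proof of Theorem 3 rests on an elementary fact about equal ratios, spelled out here for completeness. The symbols $u_i$, $v_i$, and $r$ are shorthand introduced only for this remark: $u_i$ stands for $s_{fi}$ and $v_i$ for the positive denominator $Q_i^0/P_i^0 + s_{-fi}/(\pi_i - \alpha_i)$.

```latex
% If all ratios u_i / v_i share the common value r (with every v_i > 0),
% then the ratio of the sums has the same value:
\[
  \frac{u_i}{v_i} = r \quad \forall i \in \mathcal{N}
  \;\Longrightarrow\;
  u_i = r\, v_i \quad \forall i \in \mathcal{N}
  \;\Longrightarrow\;
  \frac{\sum_{j \in \mathcal{N}} u_j}{\sum_{j \in \mathcal{N}} v_j}
  = \frac{r \sum_{j \in \mathcal{N}} v_j}{\sum_{j \in \mathcal{N}} v_j}
  = r .
\]
```

Hence the nodewise term of Model II coincides with the aggregated term of Model I at any solution of Model II.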



6.2 Uniqueness in Model II 

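In this subsection, we show that if each price intercept $\alpha_i$ is suitably restricted, then the firms' sales in the market Model II are unique. The cornerstone to this uniqueness property of the model solutions is the expression (9). Based on this expression, we show that the mapping

$$ F_{\rm II}(x, \varphi) \equiv \begin{pmatrix} q_{\rm oli} \\ 0 \end{pmatrix} + \begin{bmatrix} M_{\rm oli} & N \\ -N^T & 0 \end{bmatrix} \begin{pmatrix} x \\ \varphi \end{pmatrix} + \begin{pmatrix} 0 \\ 0 \\ h_{\rm II}(s, \lambda^\pm) \\ 0 \\ 0 \\ 0 \end{pmatrix} $$

is monotone. Throughout the following analysis, we restrict the pair $( s, \lambda^\pm )$ so that

$$ \sigma - \omega\, S + \sum_{j \in \mathcal{N}} \sum_{k \in \mathcal{A}} ( \delta_{ij} - \rho_j )\, \mathrm{PDF}_{jk}\, ( \lambda_k^+ - \lambda_k^- ) > \alpha_i, \quad \forall i \in \mathcal{N}. $$

To establish the desired monotonicity of $F_{\rm II}$, we first compute the Jacobian matrix of the function $h_{\rm II}(s, \lambda^\pm)$.

We begin by noting the following partial derivatives:

$$ \frac{\partial \pi_i}{\partial s_{fi'}} = -\omega, \quad \forall f \in \mathcal{F},\ i, i' \in \mathcal{N}, $$

and

$$ \frac{\partial \pi_i}{\partial \lambda_k^\pm} = \pm \frac{P_i^0}{Q_i^0} \sum_{j \in \mathcal{N}} \xi_{ij}\, \mathrm{PDF}_{jk}, \quad \forall i \in \mathcal{N},\ k \in \mathcal{A}. $$

Next, recalling that

$$ h_{{\rm II},fi}(s, \lambda^\pm) = \frac{ s_{fi} }{ \dfrac{Q_i^0}{P_i^0} + \dfrac{s_{-fi}}{\pi_i - \alpha_i} }, $$

we have for all $f \in \mathcal{F}$,

$$ \frac{\partial h_{{\rm II},fi}}{\partial s_{fi'}} = \begin{cases} \dfrac{1}{ \dfrac{Q_i^0}{P_i^0} + \dfrac{s_{-fi}}{\pi_i - \alpha_i} } - \dfrac{ \omega\, s_{fi}\, s_{-fi} }{ ( \pi_i - \alpha_i )^2 \Big( \dfrac{Q_i^0}{P_i^0} + \dfrac{s_{-fi}}{\pi_i - \alpha_i} \Big)^2 } & \text{if } i = i', \\[12mm] -\dfrac{ \omega\, s_{fi}\, s_{-fi} }{ ( \pi_i - \alpha_i )^2 \Big( \dfrac{Q_i^0}{P_i^0} + \dfrac{s_{-fi}}{\pi_i - \alpha_i} \Big)^2 } & \text{if } i \ne i', \end{cases} $$

and for all $f \ne f'$,

$$ \frac{\partial h_{{\rm II},fi}}{\partial s_{f'i'}} = \begin{cases} -\dfrac{ s_{fi} }{ \Big( \dfrac{Q_i^0}{P_i^0} + \dfrac{s_{-fi}}{\pi_i - \alpha_i} \Big)^2 } \left( \dfrac{1}{\pi_i - \alpha_i} + \dfrac{ \omega\, s_{-fi} }{ ( \pi_i - \alpha_i )^2 } \right) & \text{if } i = i', \\[12mm] -\dfrac{ \omega\, s_{fi}\, s_{-fi} }{ ( \pi_i - \alpha_i )^2 \Big( \dfrac{Q_i^0}{P_i^0} + \dfrac{s_{-fi}}{\pi_i - \alpha_i} \Big)^2 } & \text{if } i \ne i'. \end{cases} $$

Moreover, for all $f \in \mathcal{F}$, $i \in \mathcal{N}$ and $k \in \mathcal{A}$,

$$ \frac{\partial h_{{\rm II},fi}}{\partial \lambda_k^\pm} = \pm \frac{ s_{fi}\, s_{-fi} }{ ( \pi_i - \alpha_i )^2 \Big( \dfrac{Q_i^0}{P_i^0} + \dfrac{s_{-fi}}{\pi_i - \alpha_i} \Big)^2 }\; \frac{P_i^0}{Q_i^0} \sum_{j \in \mathcal{N}} \xi_{ij}\, \mathrm{PDF}_{jk}. $$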

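Therefore, the Jacobian matrix of $h_{\rm II}(s, \lambda^\pm)$ has the following partitioned form, with the columns corresponding to $\lambda^-$, $\lambda^+$, $s_1, \ldots, s_{|\mathcal{F}|}$:

$$ J h_{\rm II}(s, \lambda^\pm) = \begin{bmatrix} -B_1 & B_1 & A_{11} & \cdots & A_{1|\mathcal{F}|} \\ \vdots & \vdots & \vdots & & \vdots \\ -B_{|\mathcal{F}|} & B_{|\mathcal{F}|} & A_{|\mathcal{F}|1} & \cdots & A_{|\mathcal{F}||\mathcal{F}|} \end{bmatrix}, $$

where each $A_{ff'}$ is an $|\mathcal{N}| \times |\mathcal{N}|$ matrix with entries

$$ ( A_{ff'} )_{ii'} \equiv \frac{\partial h_{{\rm II},fi}}{\partial s_{f'i'}}, \quad \forall i, i' \in \mathcal{N}, $$

and each $B_f$ is an $|\mathcal{N}| \times |\mathcal{A}|$ matrix with entries

$$ ( B_f )_{ik} \equiv \frac{\partial h_{{\rm II},fi}}{\partial \lambda_k^+}, \quad \forall i \in \mathcal{N},\ k \in \mathcal{A}. $$

Consequently, the Jacobian matrix of $F_{\rm II}(x, \varphi)$ can be written as the sum of two matrices $L_1$ and $L_2$, where

$$ L_1 \equiv \begin{bmatrix} \Lambda & -\Lambda & 0 & \cdots & 0 & 0 & 0 & 0 \\ -\Lambda & \Lambda & 0 & \cdots & 0 & 0 & 0 & 0 \\ -B_1 & B_1 & A_{11} & \cdots & A_{1|\mathcal{F}|} & 0 & 0 & 0 \\ \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & \vdots \\ -B_{|\mathcal{F}|} & B_{|\mathcal{F}|} & A_{|\mathcal{F}|1} & \cdots & A_{|\mathcal{F}||\mathcal{F}|} & 0 & 0 & 0 \\ 0 & 0 & 0 & \cdots & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \cdots & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \cdots & 0 & 0 & 0 & 0 \end{bmatrix} $$

and

$$ L_2 \equiv \begin{bmatrix} 0 & M_{\lambda s} & M_{\lambda g} & 0 & 0 \\ -( M_{\lambda s} )^T & M_s & 0 & 0 & J \\ -( M_{\lambda g} )^T & 0 & 0 & I & -J \\ 0 & 0 & -I & 0 & 0 \\ 0 & -J^T & J^T & 0 & 0 \end{bmatrix}. $$

The matrix $L_2$ is positive semidefinite, being the sum of a skew-symmetric matrix and the symmetric positive semidefinite block $M_s$. To show that $L_1$ is also positive semidefinite, recall that $\Lambda = \Pi^T\, \Xi\, \Pi$. Furthermore, we have

$$ B_f = D_f\, \Xi\, \Pi, \quad \forall f \in \mathcal{F}, $$

for some $|\mathcal{N}| \times |\mathcal{N}|$ diagonal matrix $D_f$ with

$$ ( D_f )_{ii} \equiv \frac{ s_{fi}\, s_{-fi} }{ ( \pi_i - \alpha_i )^2 \Big( \dfrac{Q_i^0}{P_i^0} + \dfrac{s_{-fi}}{\pi_i - \alpha_i} \Big)^2 }\; \frac{P_i^0}{Q_i^0}, \quad \forall i \in \mathcal{N}. $$

Let

$$ A \equiv \begin{bmatrix} A_{11} & \cdots & A_{1|\mathcal{F}|} \\ \vdots & & \vdots \\ A_{|\mathcal{F}|1} & \cdots & A_{|\mathcal{F}||\mathcal{F}|} \end{bmatrix} \quad \text{and} \quad D \equiv \begin{bmatrix} D_1 \\ \vdots \\ D_{|\mathcal{F}|} \end{bmatrix}. $$

Notice that each block $A_{ff'}$ is a function of $s$ and $\pi$; so is each matrix $D_f$. Consider the matrix, written with respect to the variables $( \lambda^+, \lambda^-, s )$,

$$ \widetilde{L}_1 \equiv \begin{bmatrix} \begin{bmatrix} \Pi^T \\ -\Pi^T \end{bmatrix} \Xi \begin{bmatrix} \Pi & -\Pi \end{bmatrix} & 0 \\[5mm] D\, \Xi \begin{bmatrix} \Pi & -\Pi \end{bmatrix} & A \end{bmatrix}. $$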

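Clearly, $L_1$ is positive semidefinite if and only if $\widetilde{L}_1$ is so. The next lemma shows that the latter matrix is positive semidefinite.

Lemma 1 For every compact set $\Omega \subset \Re^{|\mathcal{N}| \times |\mathcal{F}|} \times \Re^{|\mathcal{N}|}$ there exists a scalar $\underline{\alpha}$ such that if

$$ \alpha_i < \underline{\alpha}, \quad \forall i \in \mathcal{N}, $$

the matrix $\widetilde{L}_1$ is positive semidefinite for all $(s, \pi) \in \Omega$.

Proof. The symmetric part of the matrix $\widetilde{L}_1$ is equal to

$$ \begin{bmatrix} \begin{bmatrix} \Pi^T \\ -\Pi^T \end{bmatrix} \Xi \begin{bmatrix} \Pi & -\Pi \end{bmatrix} & \dfrac{1}{2} \begin{bmatrix} \Pi^T \\ -\Pi^T \end{bmatrix} \Xi\, D^T \\[5mm] \dfrac{1}{2}\, D\, \Xi \begin{bmatrix} \Pi & -\Pi \end{bmatrix} & A \end{bmatrix}, $$

which we can write as the sum of two matrices:

$$ \begin{bmatrix} \Pi^T \\ -\Pi^T \\ \frac{1}{2} D \end{bmatrix} \Xi \begin{bmatrix} \Pi & -\Pi & \frac{1}{2} D^T \end{bmatrix} + \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & A - \frac{1}{4}\, D\, \Xi\, D^T \end{bmatrix}. $$

The first summand is clearly positive semidefinite. Provided that the pair $(s, \pi)$ is bounded, the matrix $A - \frac{1}{4} D\, \Xi\, D^T$ is positive definite for all $\alpha_i$ with $|\alpha_i|$ sufficiently large. Q.E.D.

Each pair $(s, g)$ in the compact set $Y$ induces a price vector $\pi$ via the expression (9), where $\lambda^\pm$ is a solution of the LCP (14). The induced prices are bounded by the continuity of the function $\Phi$ and the boundedness of $Y$; cf. (17). Let $\Omega$ be a compact convex subset of $\Re^{|\mathcal{N}| \times |\mathcal{F}|} \times \Re^{|\mathcal{N}|}$ containing all such pairs $(s, \pi)$. Corresponding to this set $\Omega$, we may choose a scalar $\underline{\alpha}$ such that the Jacobian matrix of $F_{\rm II}(x, \varphi)$ is positive semidefinite for all pairs $(x, \varphi)$ belonging to a convex set that contains all solutions of Model II. Thus $F_{\rm II}$ is monotone on this set. Based on this monotonicity property, we can establish the desired uniqueness of the sales and other variables in Model II.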

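Theorem 4 Under the assumptions of Theorem 2, there exists a scalar $\underline{\alpha}$ such that if

$$ \alpha_i < \underline{\alpha}, \quad \forall i \in \mathcal{N}, $$

the following variables are unique in the solutions of Model II:

(a) the sales $s_{fi}$ for all $f \in \mathcal{F}$ and $i \in \mathcal{N}$;

(b) the prices $\pi_i$ for all $i \in \mathcal{N}$;

(c) the total generations $\sum_{i \in \mathcal{N}} g_{fi}$ for all $f \in \mathcal{F}$; and

(d) the profits for each firm.

Proof. Let $( x^1, \varphi^1 )$ and $( x^2, \varphi^2 )$ be two solutions of Model II. Let $\pi^1$ and $\pi^2$ be the induced prices. By the monotonicity of $F_{\rm II}$, it follows that

$$ \begin{pmatrix} x^1 - x^2 \\ \varphi^1 - \varphi^2 \end{pmatrix}^T \big( F_{\rm II}( x^1, \varphi^1 ) - F_{\rm II}( x^2, \varphi^2 ) \big) = 0. $$

Let $\lambda^i \equiv \lambda^{+,i} - \lambda^{-,i}$ and $S^i \equiv \sum_{f \in \mathcal{F}} \sum_{j \in \mathcal{N}} s^i_{fj}$ for $i = 1, 2$. We have

$$ ( \lambda^1 - \lambda^2 )^T\, \Pi^T\, \Xi\, \Pi\, ( \lambda^1 - \lambda^2 ) + \omega\, ( S^1 - S^2 )^2 + ( s^1 - s^2 )^T \big( h_{\rm II}( s^1, \lambda^{\pm,1} ) - h_{\rm II}( s^2, \lambda^{\pm,2} ) \big) = 0. $$

By the mean-value theorem, it follows that for some pair $(s, \pi)$ on the line segment joining $( s^1, \pi^1 )$ and $( s^2, \pi^2 )$,

$$ \begin{pmatrix} \lambda^{+,1} - \lambda^{+,2} \\ \lambda^{-,1} - \lambda^{-,2} \\ s^1 - s^2 \end{pmatrix}^T \begin{bmatrix} \begin{bmatrix} \Pi^T \\ -\Pi^T \end{bmatrix} \Xi \begin{bmatrix} \Pi & -\Pi \end{bmatrix} & 0 \\[5mm] D\, \Xi \begin{bmatrix} \Pi & -\Pi \end{bmatrix} & A \end{bmatrix} \begin{pmatrix} \lambda^{+,1} - \lambda^{+,2} \\ \lambda^{-,1} - \lambda^{-,2} \\ s^1 - s^2 \end{pmatrix} + \omega\, ( S^1 - S^2 )^2 = 0, $$

where the matrices $D$ and $A$ are evaluated at $(s, \pi)$. By the proof of Lemma 1, it follows that

$$ s^1 = s^2 \quad \text{and} \quad \Xi\, \Pi\, \lambda^1 = \Xi\, \Pi\, \lambda^2. $$

This yields $\pi^1 = \pi^2$. Since

$$ \sum_{i \in \mathcal{N}} g_{fi} = \sum_{i \in \mathcal{N}} s_{fi}, \quad \forall f \in \mathcal{F}, $$

it follows that each firm's total generation is unique.

Finally, to show that the profit for each firm is unique, note that the profit of firm $f$ is equal to

$$ p_f(s, \pi, w) \sum_{i \in \mathcal{N}} s_{fi} - \sum_{i \in \mathcal{N}} ( c_{fi} - w_i )\, g_{fi} = \sum_{i \in \mathcal{N}} ( \pi_i - c_{fi} )\, g_{fi}, $$

because $\pi_i = p_f(s, \pi, w) + w_i$ (by (6)) and the sum of $s_{fi}$ over all $i$ in $\mathcal{N}$ is equal to the sum of $g_{fi}$ over all $i$ in $\mathcal{N}$. Let $( \bar{x}, \bar\varphi )$ be an arbitrary solution of Model II. Consider the linear program in the variable $g_f \equiv ( g_{fi} : i \in \mathcal{N} )$:

$$
\begin{array}{ll}
\text{maximize} & \displaystyle\sum_{i \in \mathcal{N}} ( \pi_i - c_{fi} )\, g_{fi} \\[3mm]
\text{subject to} & 0 \le g_{fi} \le \mathrm{CAP}_{fi}, \quad \forall i \in \mathcal{N}, \\[2mm]
\text{and} & \displaystyle\sum_{i \in \mathcal{N}} g_{fi} = \sum_{i \in \mathcal{N}} \bar{s}_{fi}.
\end{array}
\qquad (19)
$$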



Since $\pi_i$ for $i \in \mathcal{N}$ and $\sum_{i \in \mathcal{N}} \bar{s}_{fi}$ are constants of Model II, it follows that the above linear program depends only on the firm $f$ and does not depend on the pair $( \bar{x}, \bar\varphi )$ of solution to Model II. The optimal objective value of the linear program gives the profit of firm $f$. Q.E.D.

Abstract Optimization techniques are essential ingredients of reliability-oriented optimal designs of technical facilities. Although many technical aspects are not yet solved and the available spectrum of models and methods in structural reliability is still limited, many practical problems can be solved. A special one-level optimization is proposed for general cost-benefit analysis and some technical aspects are discussed. However, the focus is on some more critical issues, for example, "what is a reasonable replacement strategy for structural facilities?", "how safe is safe enough?" and "how to discount losses of material, opportunity and human lives?". An attempt has been made to give at least partial answers.

Keywords: Structural reliability, optimization, risk acceptability, discount rates

1. Introduction 

The theory of structural reliability has been developed to fair maturity within the last 30 years. The inverse problem, i.e. how to determine certain parameters in the function describing the boundary between safe and failure states for given reliability, has been addressed only recently. It is a typical optimization problem. Designing, erecting and maintaining structural facilities may be viewed as a decision problem where maximum benefit and least cost are sought and the reliability requirements are fulfilled simultaneously. In what follows, the basic formulations of the various aspects of the decision problem are outlined, making use of some more recent results in the engineering literature. The structure of a suitable objective function is first discussed. A renewal model proposed as early as 1971 by Rosenblueth/Mendoza [42], further developed in [17], [40] and extended in [36], [18], is presented in some detail. Theory and methods of structural reliability are reviewed next, where it is pointed out that the calculation of suitable reliability measures is essentially an optimization problem. The focus is on the concepts of modern first- and second-order reliability methods [20]. The problem of the value of human life is then discussed in the context of modern health-related economic theories. Some remarks are made about appropriate discount rates. Finally, details of a special version of modern reliability-oriented optimization techniques based on work in [26] are outlined, followed by an illustrative example.



2. Optimal Structures 

A structure is optimal if the following objective is maximized: 

$$Z(p) = B(p) - C(p) - D(p) \tag{1}$$

Without loss of generality it is assumed that all quantities in eq. (1)
can be measured in monetary units. $B(p)$ is the benefit derived from the
existence of the structure, $C(p)$ is the cost of design and construction
and $D(p)$ is the cost in case of failure. $p$ is the vector of all safety
relevant parameters. Statistical decision theory dictates that expected
values are to be taken. In the following it is assumed that $B(p)$, $C(p)$
and $D(p)$ are differentiable in each component of $p$. The cost may differ
for the different parties involved, e.g. the owner, the builder, the user
and society. A structural facility makes sense only if $Z(p)$ is positive for
all parties involved within certain parameter ranges. The intersection of
these ranges defines reasonable structures.

The structure which eventually will fail after a long time will have to
be optimized at the decision point, i.e. at time $t = 0$. Therefore, all costs
need to be discounted. We assume a continuous discounting function
$\delta(t) = \exp[-\gamma t]$ which is accurate enough for all practical
purposes and where $\gamma$ is the interest rate.

It is useful to distinguish between two replacement strategies, one 
where the facility is given up after failure and one where the facil- 
ity is systematically replaced after failure. Further we distinguish 
between structures which fail upon completion or never and struc- 
tures which fail at a random point in time much later due to service 
loads, extreme external disturbances or deterioration. The first option 
implies that loads on the structure are time invariant. Reconstruction 
times are assumed to be negligibly short. At first sight there is no 
particular preference for either of the replacement strategies. For infras- 
tructure facilities the second strategy is a natural strategy. Structures 
only used once, e.g. special auxiliary construction structures or boosters 
for space transport vehicles fall into the first category. 
3. The Renewal Model

3.1 Failure upon completion due to 
time-invariant loads 

The objective function for a structure given up after failure at 
completion due to time-invariant loads (essentially dead weight) is 

$$Z(p) = B^* R_f(p) - C(p) - H P_f(p) = B^* - C(p) - (B^* + H) P_f(p) \tag{2}$$

$R_f(p)$ is the reliability and $P_f(p) = 1 - R_f(p)$ the failure probability,
respectively. $H$ is the direct cost of failure including demolition and
debris removal cost. For failure at completion and systematic
reconstruction we have



$$\begin{aligned}
Z(p) &= B^* - C(p) - (C(p) + H) \sum_{i=1}^{\infty} i\, P_f(p)^i R_f(p) \\
&= B^* - C(p) - (C(p) + H) \frac{P_f(p)}{1 - P_f(p)}
\end{aligned} \tag{3}$$

After failure one, of course, investigates its causes and redesigns the
structure. However, we will assume that the first design was already
optimal so that there is no reason to change the design rules, leading to
the same $P_f(p)$. If the structural realizations are independent of each
other, formula (3) holds.
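
The two objectives for failure at completion can be compared numerically. The following is a minimal sketch, assuming the closed forms of eqs. (2) and (3); the function names and all numbers are hypothetical and serve only as an illustration. The truncated series is included to check that the sum in eq. (3) indeed approaches $P_f/(1 - P_f)$.

```python
def objective_abandon(benefit, cost, failure_cost, p_f):
    """Eq. (2): Z = B* - C - (B* + H) * P_f, facility given up after failure."""
    return benefit - cost - (benefit + failure_cost) * p_f


def objective_rebuild(benefit, cost, failure_cost, p_f):
    """Eq. (3): Z = B* - C - (C + H) * P_f / (1 - P_f), systematic reconstruction."""
    return benefit - cost - (cost + failure_cost) * p_f / (1.0 - p_f)


def rebuild_series(p_f, terms=200):
    """Truncated series sum_{i>=1} i * P_f^i * R_f appearing in eq. (3)."""
    r_f = 1.0 - p_f
    return sum(i * p_f**i * r_f for i in range(1, terms + 1))


# Hypothetical values in arbitrary monetary units.
B_STAR, C, H, P_F = 100.0, 20.0, 50.0, 0.01
z_abandon = objective_abandon(B_STAR, C, H, P_F)
z_rebuild = objective_rebuild(B_STAR, C, H, P_F)
series = rebuild_series(P_F)
print(z_abandon, z_rebuild, series)
```

For small failure probabilities the two strategies give similar objectives; they differ mainly in whether the lost benefit or the reconstruction cost is charged to a failure.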

A certain ambiguity exists when assessing the benefit $B^*$, taken here
and in the following as independent of $p$. If the intended time of use of
the facility is $t_s$ it is simply

$$B^* = B(t_s) = \int_0^{t_s} b(t)\,\delta(t)\,dt \tag{4}$$

For constant benefit per time unit $b(t) = b$ one determines

$$B^* = B(t_s) = \frac{b}{\gamma}\left[1 - \exp[-\gamma t_s]\right] \;\to\; \frac{b}{\gamma} \quad (t_s \to \infty) \tag{5}$$

3.2 Random Failure in Time

Assume now random failure events in time. The time to the first event
has distribution function $F_1(t, p)$ with probability density $f_1(t, p)$.
If the structure is given up after failure it is obviously

$$B(t_s) = \int_0^{t_s} b(t)\,\delta(t)\,R_1(t, p)\,dt \tag{6}$$

$$D(t_s) = \int_0^{t_s} f_1(t, p)\,\delta(t)\,H\,dt \tag{7}$$

and therefore

$$Z(p) = \int_0^{t_s} b(t)\,\delta(t)\,R_1(t, p)\,dt - C(p) - \int_0^{t_s} \delta(t)\,f_1(t, p)\,H\,dt \tag{8}$$

For $t_s \to \infty$ and $f_1^*(\gamma, p) = \int_0^\infty e^{-\gamma t} f_1(t, p)\,dt$
the Laplace transform of $f_1(t, p)$ it is instead

$$Z(p) = \frac{b}{\gamma}\left[1 - f_1^*(\gamma, p)\right] - C(p) - H f_1^*(\gamma, p) \tag{9}$$

For the more important case of systematic reconstruction we generalize
our model slightly. Assume that the time to first failure has
density $f_1(t)$ while all other times between failures are independent of
each other and have density $f(t)$, i.e. failures and subsequent renewals
follow a modified renewal process [11]. This makes sense because extreme
loading events usually are not controllable, i.e. the time origin lies
somewhere between the zeroth and first event. The independence assumption
is more critical. It implies that the structures are realized with
independent resistances at each renewal according to the same design rules
and the loads on the structures are independent, at least asymptotically.
For constant benefit per time unit $b(t) = b$ we now derive by making use of
the convolution theorem for Laplace transforms

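$$\begin{aligned}
Z(p) &= \int_0^\infty b\,e^{-\gamma t}\,dt - C(p) - (C(p) + H) \sum_{n=1}^{\infty} \int_0^\infty e^{-\gamma t} f_n(t, p)\,dt \\
&= \frac{b}{\gamma} - C(p) - (C(p) + H) \frac{f_1^*(\gamma, p)}{1 - f^*(\gamma, p)} \\
&= \frac{b}{\gamma} - C(p) - (C(p) + H)\,h_1^*(\gamma, p)
\end{aligned} \tag{10}$$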

where $h_1^*(\gamma, p)$ is the Laplace transform of the renewal intensity
$h_1(t, p)$. For regular renewal processes one replaces $f_1^*(\gamma, p)$ by
$f^*(\gamma, p)$. For the renewal intensity and its Laplace transform there
is an important asymptotic result [11]:

$$\lim_{t \to \infty} h(t, p) = \lim_{\gamma \to 0} \gamma\, h^*(\gamma, p) = \frac{1}{m(p)} \tag{11}$$

where $m(p)$ is the mean of the renewal times.
If, in particular, the events follow a stationary Poisson process with
intensity $\lambda$ we have

$$f_1^*(\gamma) = f^*(\gamma) = \int_0^\infty \exp[-\gamma t]\,\lambda \exp[-\lambda t]\,dt = \frac{\lambda}{\gamma + \lambda} \tag{12}$$

and

$$h^*(\gamma) = \frac{\lambda}{\gamma} \tag{13}$$
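
The Poisson case lends itself to a quick numerical check. The sketch below, with hypothetical parameter values, approximates the Laplace transform of the exponential density by the trapezoidal rule, compares it with $\lambda/(\gamma + \lambda)$, forms the renewal intensity transform $f^*/(1 - f^*)$ and evaluates the objective of eq. (10). The helper names are assumptions made for this illustration.

```python
import math


def laplace_transform(density, gamma, t_max, steps=200_000):
    """Trapezoidal approximation of int_0^t_max exp(-gamma*t) * density(t) dt."""
    h = t_max / steps
    total = 0.5 * (density(0.0) + math.exp(-gamma * t_max) * density(t_max))
    for k in range(1, steps):
        t = k * h
        total += math.exp(-gamma * t) * density(t)
    return total * h


def renewal_intensity_transform(f_star):
    """h* = f* / (1 - f*) for a regular renewal process."""
    return f_star / (1.0 - f_star)


def objective_renewal(b, gamma, cost, failure_cost, h_star):
    """Eq. (10): Z = b/gamma - C - (C + H) * h*."""
    return b / gamma - cost - (cost + failure_cost) * h_star


# Hypothetical values: failure intensity and interest rate per year.
LAM, GAMMA = 0.01, 0.03
f_star_num = laplace_transform(lambda t: LAM * math.exp(-LAM * t), GAMMA, 2000.0)
h_star_num = renewal_intensity_transform(f_star_num)
z_value = objective_renewal(10.0, GAMMA, 100.0, 300.0, LAM / GAMMA)
print(f_star_num, h_star_num, z_value)
```

The numerical transform is cheap here because the density is smooth and decays exponentially; for heavy-tailed or oscillating densities a more careful quadrature would be needed.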

This result is of great importance because structural failures should, in
fact, be rare, independent events. Then, the Poisson intensity $\lambda$ can be
replaced by the so-called outcrossing rate, to be described below,
even in the locally non-stationary case. Finally, if at an extreme loading
event (e.g. flood, wind storm, earthquake, explosion) failure occurs
with probability $P_f(p)$ and $f_1(t)$ and $f(t)$, respectively, denote the
densities of the times between the loading events one obtains by similar
considerations

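$$g_1^*(\gamma, p) = \sum_{n=1}^{\infty} f_n^*(\gamma)\, P_f(p)\, R_f(p)^{n-1} = \frac{P_f(p)\, f_1^*(\gamma)}{1 - R_f(p)\, f^*(\gamma)} \tag{14}$$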



For the case treated in eq. (13) we have for stationary Poissonian load
occurrences:

$$h^*(\gamma, p) = \frac{g_1^*(\gamma, p)}{1 - g^*(\gamma, p)} = \frac{P_f(p)\,\lambda}{\gamma} \tag{15}$$



Unfortunately, Laplace transforms are rarely available in closed form. Taking
Laplace transforms numerically requires some effort, but taking the inverse
Laplace transform must simply be considered a numerically ill-posed
problem. In that case, however, one can always resort to the asymptotic
result, which can be shown to be accurate enough for all practical purposes.

The foregoing results can be generalized to cover multiple mode fail- 
ure, loss of serviceability, obsolescence of the facility and inspection and 
maintenance. Also, the case of non-constant benefit, a case of obsoles- 
cence, or non-constant damage has been addressed. Further develop- 
ments are under way. 



4. Computation of Failure Probabilities and 
Failure Rates 

4.1 Time-invariant Reliabilities 

The simplest problem of computing failure probabilities is given as a 
volume integral 

$$P_f(p) = P(F(p)) = \int_{F(p)} dF_X(x) \tag{16}$$

where the failure event is $F(p) = \{h(x, p) \le 0\}$ and the random vector
$X = (X_1, X_2, \ldots, X_n)^T$ has joint distribution function $F_X(x)$. Since
$n$ usually is large and $P_f(p)$ is small, serious numerical difficulties
occur if standard methods of numerical integration are applied. However,
if it is assumed that the density $f_X(x, p)$ of $F_X(x)$ exists everywhere
and $h(x, p)$ is twice differentiable, then the problem of computing failure
probabilities can be converted into a problem of optimization and
some simple algebra. For convenience, a probability preserving
distribution transformation $U = T^{-1}(X)$ is first applied [19]. Making
use of Laplace integration methods [4] one can then show that with
$h(x, p) = h(T(u), p) = g(u, p)$ [5], [20]

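$$\begin{aligned}
P_f(p) &= \int_{h(x,p) \le 0} f_X(x, p)\,dx = \int_{g(u,p) \le 0} \varphi_U(u)\,du \\
&\sim \Phi(-\beta) \prod_{i=1}^{n-1} (1 - \beta\kappa_i)^{-1/2} \\
&\approx \Phi(-\beta)
\end{aligned} \tag{17}$$

for $1 < \beta \to \infty$ with

$$\beta = \|u^*\| = \min\{\|u\|\} \ \text{for} \ \{u : g(u, p) \le 0\} \tag{18}$$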

$\varphi_U(u)$ the multinormal density, $\Phi(\cdot)$ the one-dimensional normal
integral, $g(0, p) > 0$ and $\kappa_i$ the main curvatures of the failure surface
$\partial F = \{g(u, p) = 0\}$. Of course, it is assumed that a unique "critical"
point $u^*$ exists but methods have been devised to also locate and
consider appropriately multiple critical points. In line two the asymptotic
result is given, denoted as second-order since the Hessian of $g(u, p) = 0$
is involved. The last result represents a first-order result corresponding
to a linearization of $g(u, p)$ in $u^*$, already pointed out by [16]. Very
frequently this is sufficiently accurate in practical applications.
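
The first-order result can be illustrated with a small search for the critical point. The sketch below uses the simple Hasofer-Lind/Rackwitz-Fiessler fixed-point iteration as a stand-in for the tuned algorithms mentioned in the text; it is not the method of any cited reference. The limit state function is a hypothetical linear example in standard normal space for which $\beta = 3$ is known exactly.

```python
import math


def std_normal_cdf(x):
    """One-dimensional standard normal integral Phi(x)."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def gradient(g, u, eps=1e-5):
    """Central finite-difference gradient of g at u."""
    grad = []
    for i in range(len(u)):
        up, um = list(u), list(u)
        up[i] += eps
        um[i] -= eps
        grad.append((g(up) - g(um)) / (2.0 * eps))
    return grad


def form_beta(g, n, tol=1e-8, max_iter=100):
    """HL-RF iteration for beta = min ||u|| subject to g(u) = 0."""
    u = [0.0] * n
    for _ in range(max_iter):
        grad = gradient(g, u)
        norm2 = sum(c * c for c in grad)
        scale = (sum(c * x for c, x in zip(grad, u)) - g(u)) / norm2
        u_new = [scale * c for c in grad]
        step = max(abs(a - b) for a, b in zip(u_new, u))
        u = u_new
        if step < tol:
            break
    return math.sqrt(sum(x * x for x in u)), u


def limit_state(u):
    # Hypothetical linear limit state with g(0) = 3 > 0; exact beta is 3.
    return 3.0 - (u[0] + u[1]) / math.sqrt(2.0)


beta, u_star = form_beta(limit_state, n=2)
p_f_first_order = std_normal_cdf(-beta)
print(beta, u_star, p_f_first_order)
```

This plain iteration can fail to converge for strongly curved failure surfaces, which is one reason why more robust optimization algorithms are preferred in practice.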

4.2 Time-variant Reliabilities

Much more difficult is the computation of time-variant reliabilities.
Here, the question is not whether the system is in an adverse state at some
arbitrary point in time but whether it enters it for the first time given
that it was initially, at time $t = 0$, in a safe state. The problem is known
as the first passage problem in the engineering literature. Exact results
for distributions of first passage times are almost nonexistent. However,
good approximations can be obtained by the so-called outcrossing approach
[13]. The outcrossing rate is defined by

$$\nu^+(\tau) = \lim_{\Delta \to 0} \frac{1}{\Delta} P(N(\tau, \tau + \Delta) = 1) \tag{19}$$

or for the original vector process

$$\nu^+(\tau) = \lim_{\Delta \to 0} \frac{1}{\Delta} P(\{X(\tau) \notin F\} \cap \{X(\tau + \Delta) \in F\}) \tag{20}$$

One easily sees that the definition of the outcrossing rate coincides
formally with the definition of the renewal intensity. The counting process
$N(\cdot)$ of outcrossings must be a regular process [12] so that the mean
value of outcrossings in $[0, t]$ is given by

$$E[N(t)] = \int_0^t \nu^+(\tau)\,d\tau \tag{21}$$

One can derive an important upper bound. Failure occurs either if
$X(0) \in F$ or $N(t) > 0$. Therefore [28]

$$\begin{aligned}
P_f(t) &= 1 - P(X(\tau) \notin F \ \text{for all} \ \tau \in [0, t]) \\
&= P(\{X(0) \in F\} \cup \{N(t) > 0\}) \\
&= P(X(0) \in F) + P(N(t) > 0) - P(\{X(0) \in F\} \cap \{N(t) > 0\}) \\
&\le P(X(0) \in F) + P(N(t) > 0) \\
&\le P(X(0) \in F) + E[N(t)]
\end{aligned} \tag{22}$$

If the original process is sufficiently mixing one can derive the asymptotic
result [13]:

$$P_f(t) \sim 1 - \exp[-E[N(t)]] \tag{23}$$

justifying the remarks below eq. (13). A lower bound can also be given,
but it is less useful.
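
The upper bound and the asymptotic approximation are easily evaluated once the outcrossing rate is known. The following minimal sketch integrates a given rate function by the trapezoidal rule and evaluates both expressions; the rate, the time horizon and the initial failure probability are hypothetical values chosen for illustration.

```python
import math


def mean_outcrossings(rate, t, steps=1000):
    """E[N(t)] = int_0^t nu(tau) dtau by the trapezoidal rule, cf. eq. (21)."""
    h = t / steps
    total = 0.5 * (rate(0.0) + rate(t))
    total += sum(rate(k * h) for k in range(1, steps))
    return total * h


def failure_upper_bound(p_initial, mean_crossings):
    """Upper bound P(X(0) in F) + E[N(t)] of eq. (22), capped at 1."""
    return min(1.0, p_initial + mean_crossings)


def failure_asymptotic(mean_crossings):
    """Asymptotic approximation 1 - exp(-E[N(t)]) of eq. (23)."""
    return -math.expm1(-mean_crossings)


# Hypothetical inputs: constant rate of 1e-4 per year over 50 years.
mean_n = mean_outcrossings(lambda tau: 1.0e-4, 50.0)
bound = failure_upper_bound(1.0e-5, mean_n)
approx = failure_asymptotic(mean_n)
print(mean_n, bound, approx)
```

For the small rates typical of structural failures the two values nearly coincide, which is consistent with treating failures as rare Poissonian events.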

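Consider a stationary vectorial rectangular wave renewal process, each
component having renewal rate $\lambda_i$ and amplitude distribution function
$F_i(x)$. The amplitudes $X_i$ are independent. Regularity assures that only
one component has a renewal in a small time interval, with probability
$\lambda_i \Delta$. Then [9]

$$\begin{aligned}
\nu^+(F)\,\Delta &= P\Big(\bigcup_{i=1}^{n} \{\text{renewal in } [0, \Delta]\} \cap \{X_i^- \notin F\} \cap \{X_i^+ \in F\}\Big) \\
&= \sum_{i=1}^{n} \lambda_i \Delta\, P(\{X_i^- \notin F\} \cap \{X_i^+ \in F\}) \\
&= \sum_{i=1}^{n} \lambda_i \Delta \left[P(X_i^+ \in F) - P(\{X_i^- \in F\} \cap \{X_i^+ \in F\})\right]
\end{aligned} \tag{24}$$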

Figure 1. Outcrossings of a vectorial rectangular wave renewal process

$X_i^-$ denotes the process $X$ before and $X_i^+$ the process after a jump
of the $i$-th component. If the components are standard normally
distributed and the failure domain is a half-space
$F = \{\alpha^T u + \beta \le 0\}$ one determines

$$\begin{aligned}
\nu^+(F) &= \sum_{i=1}^{n} \lambda_i \left[P(\{\alpha^T U_i^- > -\beta\} \cap \{\alpha^T U_i^+ \le -\beta\})\right] \\
&= \sum_{i=1}^{n} \lambda_i \left[\Phi_2(\beta, -\beta; -\rho_i)\right] \le \sum_{i=1}^{n} \lambda_i \Phi(-\beta)
\end{aligned} \tag{25}$$

where $\rho_i = 1 - \alpha_i^2$ is the correlation coefficient of the process
before and after a jump and $\Phi_2(\cdot, \cdot; \cdot)$ the bivariate normal
integral. For general non-linear failure surfaces one can show that
asymptotically [8]

$$\nu^+(F) = \sum_{i=1}^{n} \lambda_i\, \Phi(-\beta) \prod_{i=1}^{n-1} (1 - \beta\kappa_i)^{-1/2}; \quad 1 < \beta \to \infty \tag{26}$$

with $\beta = \|u^*\| = \min\{\|u\|\}$ for $g(u) \le 0$ and $\kappa_i$ the main
curvatures in the solution point $u^*$. This corresponds to the result in
eq. (17). The same optimization problem as in the time-invariant case has
to be solved. Rectangular wave renewal processes are used to model live
loads, sea states, traffic loads, etc.
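
For a half-space failure domain the upper bound on the outcrossing rate needs only the renewal rates and $\beta$. The sketch below, with hypothetical rates and direction cosines, evaluates the correlation coefficients $\rho_i = 1 - \alpha_i^2$ and the bound $\sum_i \lambda_i \Phi(-\beta)$; the bivariate normal integral itself is not evaluated here.

```python
import math


def std_normal_cdf(x):
    """One-dimensional standard normal integral Phi(x)."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def jump_correlations(alpha):
    """rho_i = 1 - alpha_i**2 for a unit normal vector alpha."""
    return [1.0 - a * a for a in alpha]


def outcrossing_upper_bound(rates, beta):
    """Upper bound sum_i lambda_i * Phi(-beta) on the outcrossing rate."""
    return sum(rates) * std_normal_cdf(-beta)


# Hypothetical two-component process: renewal rates per year, unit vector alpha.
alpha = [0.6, 0.8]
rates = [1.0, 0.1]
rho = jump_correlations(alpha)
nu_bound = outcrossing_upper_bound(rates, beta=3.0)
print(rho, nu_bound)
```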

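For stationary vector processes with differentiable sample paths it is
useful to standardize the original process $X(t)$ and its derivative (in
mean square) process $\dot{X}(t) = \frac{d}{dt} X(t)$ such that
$E[U(t)] = E[\dot{U}(t)] = 0$, $R(0) = I$ where $R(\tau) = E[U(0)\,U(\tau)^T]$
is the matrix of correlation functions and $\tau = |t_1 - t_2|$. A matrix of
cross correlation functions between $U(t)$ and $\dot{U}(t)$,
$\dot{R}(\tau) = E[U(0)\,\dot{U}(\tau)^T]$, as well as of the derivative
process, $\ddot{R}(\tau) = E[\dot{U}(0)\,\dot{U}(\tau)^T]$, also exists. The
general outcrossing rate is defined by [38], [3]

$$\nu^+(\tau) = \lim_{\Delta\tau \to 0} \frac{P(\{U(t) \in \Delta(\partial F(t))\} \cap \{\dot{U}_N(t) > \partial\dot{F}(t)\} \ \text{in} \ [\tau \le t \le \tau + \Delta\tau])}{\Delta\tau} \tag{27}$$

where $\dot{U}_N(t) = n^T(u, t)\,\dot{U}(t)$ is the projection of $\dot{U}(t)$
on the normal $n(u, t) = -\alpha(u, t)$ of $\partial F(t)$ in $(u, t)$.
$\Delta(\partial F(t))$ is a thin layer around $\partial F(t)$ with thickness
$(\dot{u}_N(t) - \partial\dot{F}(t))\Delta\tau$. Hence, it is:

$$\begin{aligned}
&P(\{U(t) \in \Delta(\partial F(t))\} \cap \{\dot{U}_N(t) > \partial\dot{F}(t)\} \ \text{in} \ [\tau \le t \le \tau + \Delta\tau]) \\
&\quad = \int_{\Delta(\partial F(t))} \int_{\dot{u}_N(t) > \partial\dot{F}(t)} \varphi_{n+1}(u, \dot{u}_N, t)\,du\,d\dot{u}_N \\
&\quad = \Delta\tau \int_{\partial F(t)} \int_{\dot{u}_N(t) > \partial\dot{F}(t)} (\dot{u}_N - \partial\dot{F}(t))\,\varphi_{n+1}(u, \dot{u}_N, t)\,ds(u)\,d\dot{u}_N
\end{aligned} \tag{28}$$

In the stationary case one finds with $\partial F = \{g(u) = 0\}$

$$\begin{aligned}
\nu^+(\partial F) &= \int_{\partial F} \int_0^\infty \dot{u}_N\,\varphi_{n+1}(u, \dot{u}_N)\,d\dot{u}_N\,ds(u) \\
&= \int_{\partial F} \int_0^\infty \dot{u}_N\,\varphi_1(\dot{u}_N \mid U = u)\,\varphi_n(u)\,d\dot{u}_N\,ds(u) \\
&= \int_{\partial F} E_0^\infty[\dot{U}_N \mid U = u]\,\varphi_n(u)\,ds(u) \\
&= \int_{\mathbb{R}^{n-1}} E_0^\infty[\dot{U}_N \mid U = u]\,\varphi_n(u, p(u))\,T(u)\,du
\end{aligned} \tag{29}$$

where $u_n = p(u) = g^{-1}(u_1, u_2, \ldots, u_{n-1})$ is a parameterization of
the surface and $T(u)$ the corresponding transformation determinant.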

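Figure 2. Outcrossing of a vectorial differentiable process

Explicit results are available only for special forms of the failure
surface. For example, if it is a hyperplane

$$\partial F = \left\{ \sum_{i=1}^{n} \alpha_i u_i + \beta = 0 \right\} \tag{30}$$

the outcrossing rate of a stationary standardized Gaussian process is
[51]:

$$\nu^+(\partial F) = E_0^\infty[\dot{U}_N]\,\varphi(\beta) = \frac{\omega_0}{\sqrt{2\pi}}\,\varphi(\beta) \tag{31}$$

with $\omega_0^2 = \alpha^T \ddot{R}(0)\,\alpha$. An asymptotic result for general
non-linear surfaces has been derived in [7]:

$$\nu^+(\partial F) = \frac{\omega_0}{\sqrt{2\pi}}\,\varphi(\beta) \prod_{i=1}^{n-1} (1 - \beta\kappa_i)^{-1/2} \tag{32}$$

with

$$\omega_0^2 = n(u^*)^T \left[\ddot{R}(0) + \dot{R}(0)^T G(u^*)\,\dot{R}(0)\right] n(u^*)$$

provided that $g(0) > 0$ and with $\dot{R}(0) = E[U(0)\,\dot{U}(0)^T]$ and

$$G(u^*) = \left\{ \|\nabla g(u^*)\|^{-1} \frac{\partial^2 g(u^*)}{\partial u_i\,\partial u_j};\ i, j = 1, \ldots, n \right\}$$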
Here again we have $\beta = \|u^*\| = \min\{\|u\|\}$ for $g(u) \le 0$ and
$\kappa_i$ are the main curvatures of $\partial F$ in the solution point $u^*$.
Differentiable processes are used to model the turbulent natural wind,
wind waves and earthquake excitations but also the output of dynamical
systems.

Exact or approximate results have also been obtained for non-gaussian 
rectangular wave processes with or without correlated components [34], 
certain non-gaussian differentiable processes [14] and a variety of non- 
stationarities of the processes or the failure surfaces [35]. If one is not 
satisfied with the (asymptotic) approximations one can apply impor- 
tance sampling methods in order to arrive at an arbitrarily exact result. 
Due to regularity of the crossings one can combine rectangular wave 
and differentiable processes. The processes can be intermittent [46], [22]. 
This allows the modelling of disturbances of short to very short duration 
(earthquakes, explosions). Such models have also been extended to deal 
with occurrence clustering [55], [45]. 

It is remarkable that the "critical" point $u^*$, i.e. the magnitude of
$\beta$, plays an important role in all cases as in the time-invariant case.
It must be found by a suitable algorithm. Sequential quadratic programming
algorithms tuned to the special problem of interest turned out to solve
the optimization problem reliably and efficiently in practical applications
[1].

However, it must be mentioned that in time-variant reliability more 
general models, e.g. renewal models with non-rectangular wave shapes, 
filtered Poisson process models, etc. can be easily formulated but hardly 
made practical from a computational point of view. 

5. The Value of Human Life and Limb in the 
Public Interest 

Two questions remain: (a) Is it admissible to optimize benefits and
cost if human lives are endangered, and (b) can we discount the "cost
of human lives"? First of all, modern approaches to these questions do
not speak of a monetary value of human life but rather of the
cost to save lives. Secondly, any further argumentation must be within
the framework of our moral and ethical principles as laid down in our
constitutions and elsewhere. We quote as an example a few articles from
the Basic Law of the Federal Republic of Germany:

■ Article 2: (1) Everyone has the right to the free development of his
personality ... (2) Everyone has the right to life and to inviolability
of his person

■ Article 3: (1) All persons are equal before the law. (2) Men and
women have equal rights. (3) No one may be prejudiced or favored
because of his sex, his parentage, his race, his language, his homeland
and origin, his faith or his religious or political opinions.

Similar principles are found in all modern, democratic constitutions.
But H. D. Thoreau (1817-1862) realistically says about the value
of human life: "The cost of a thing is the amount of what I will call life
which is required to be exchanged for it, immediately or in the long run.
..." [29].

Can these value fixings be transferred to engineering acceptability
criteria? This is possible when starting from certain social indicators such
as life expectancy, gross national product (GNP), state of health care,
etc. Life expectancy $e$ is the area under the survivor curve $S(a)$ as a
function of age $a$, i.e. $e = \int_0^\infty S(a)\,da$. A suitable measure for
the quality of life is the GNP per capita, despite some moral indignation
at first sight. The GNP is created by labor and capital (stored labor). It
provides the infrastructure of a country, its social structure, its cultural
and educational offers, its ecological conditions among others but also
the means for the individual enjoyment of life by consumption. Most
importantly in our context, it creates the possibilities to "buy" additional
life years through better medical care, improved safety in road traffic,
more safety in or around building facilities or from hazardous technical
activities, etc. Safety of buildings via building codes is an investment
into saving lives. The investments into structural safety must be
efficient, however. Otherwise investments into other life saving activities
are preferable. In all further considerations only about 60% of the GNP,
i.e. $g \approx 0.6$ GNP which is the part available for private use, is taken
into account.

Denote by $c(\tau) > 0$ the consumption rate at age $\tau$ and by $u(c(\tau))$
the utility derived from consumption. Individuals tend to undervalue
a prospect of future consumption as compared to that of present
consumption. This is taken into account by some discounting. The life time
utility for a person at age $a$ until she/he attains age $t > a$ then is

$$U(a, t) = \int_a^t u[c(\tau)] \exp\left[-\int_a^\tau \rho(\theta)\,d\theta\right] d\tau = \int_a^t u[c(\tau)] \exp[-\rho(\tau - a)]\,d\tau \tag{33}$$

for $\rho(\theta) = \rho$. It is assumed that consumption is not delayed, i.e.
incomes are not transformed into bequests. $\rho$ should be conceptually
distinguished from a financial interest rate and is referred to as rate of time
preference of consumption. A rate $\rho > 0$ has been interpreted as the
effect of human impatience, myopia, egoism, lack of telescopic faculty,
etc. Exponential population growth with rate $n$ can be considered by
replacing $\rho$ by $\rho - n$, taking into account that families are by a
factor $\exp[nt]$ larger at a later time $t > 0$. The correction $\rho > n$
appears always necessary, simply because future generations are expected
to be larger and wealthier. $\rho$ is reported to be between 1 and 3% for
health related investments, with a tendency to lower values [53]. Empirical
estimates reflecting pure consumption behavior vary considerably but are
in part significantly larger [25].

The expected remaining present value life time utility at age $a$
(conditional on having survived until $a$) then is (see [2], [43], [39], [15])

$$\begin{aligned}
L(a) = E[U(a)] &= \int_a^{a_u} \frac{f(t)}{\ell(a)}\,U(a, t)\,dt \\
&= \frac{1}{\ell(a)} \int_a^{a_u} u[c(t)] \exp[-(\rho - n)(t - a)]\,\ell(t)\,dt \\
&= u[c]\,e_d(a, \rho, n)
\end{aligned} \tag{34}$$

where $f(t)\,dt = \mu(t) \exp\left[-\int_0^t \mu(\tau)\,d\tau\right] dt$ is the
probability of dying between age $t$ and $t + dt$ computed from life tables.
The expression in the third line is obtained upon integration by parts.
Also, a constant consumption rate $c$ independent of $t$ has been introduced
which can be shown to be optimal under perfect market conditions [43]. The
"discounted" life expectancy $e_d(a, \rho, n)$ at age $a$ can be computed from

$$e_d(a, \rho, n) = \frac{\exp((\rho - n)a)}{\ell(a)} \int_a^{a_u} \exp\left[-\int_0^t (\mu(\tau) + (\rho - n))\,d\tau\right] dt \tag{35}$$

"Discounting" affects $e_d(a, \rho, n)$ primarily when $\mu(\tau)$ is small (i.e.
at young age) while it has little effect for larger $\mu(\tau)$ at higher
ages. It is important to recognize that "discounting" by $\rho$ is initially
with respect to $u[c(\tau)]$ but is formally included in the life expectancy
term.
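
The "discounted" life expectancy can be evaluated numerically once an age-dependent mortality rate is available. The following minimal sketch integrates the survival probability from age $a$ onward, weighted by $\exp[-(\rho - n)(t - a)]$, which is equivalent to eq. (35). The Gompertz mortality law and its parameters are hypothetical stand-ins for real life-table data, and the upper age limit of 110 years is an assumption.

```python
import math


def discounted_life_expectancy(mu, age, rho, n, a_max=110.0, steps=20000):
    """Integrate exp(-int_age^t mu - (rho - n)(t - age)) dt from age to a_max."""
    h = (a_max - age) / steps
    cum_hazard = 0.0          # running integral of mu from age to t
    prev_mu = mu(age)
    prev_val = 1.0            # integrand at t = age
    total = 0.0
    for k in range(1, steps + 1):
        t = age + k * h
        cur_mu = mu(t)
        cum_hazard += 0.5 * (prev_mu + cur_mu) * h
        val = math.exp(-cum_hazard - (rho - n) * (t - age))
        total += 0.5 * (prev_val + val) * h
        prev_mu, prev_val = cur_mu, val
    return total


def gompertz(a, level=5.0e-5, slope=0.09):
    # Hypothetical mortality law, not fitted to any real population.
    return level * math.exp(slope * a)


e_plain = discounted_life_expectancy(gompertz, 0.0, rho=0.0, n=0.0)
e_disc = discounted_life_expectancy(gompertz, 0.0, rho=0.03, n=0.01)
e_const = discounted_life_expectancy(lambda a: 0.02, 0.0, rho=0.0, n=0.0)
print(e_plain, e_disc, e_const)
```

With a constant mortality rate the integral has the closed form $(1 - \exp[-\mu a_u])/\mu$, which gives a simple check of the quadrature.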

For $u[c]$ we select a power function

$$u[c] = \frac{c^q - 1}{q} \tag{36}$$

with $0 < q < 1$, implying constant relative risk aversion according to
Arrow-Pratt. The form of eq. (36) reflects the reasonable assumption
that marginal utility decays with consumption $c$. $u[c]$ is
a concave function since $u'[c] > 0$ for $q > 0$ and $u''[c] < 0$ for $q < 1$.
The numerical value has been chosen to be about 0.2 (see [43], [15] and
elsewhere as well as table 2 below). It may also be derived from the
work-leisure optimization principle as outlined in [29] where
$q = \frac{w}{1 - w}$ and $w$ the average fraction of $e$ devoted to (paid) work
(see [37] for estimates derived from this principle). This magnitude has
also been verified empirically (see, for example, [25]). For simplicity, we
also take $u[c] \approx c^q / q$.

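Shepard/Zeckhauser [43] now define the "value of a statistical life" at
age $a$ by converting eq. (34) into monetary units in dividing it by the
marginal utility $u'[c]$:

$$\begin{aligned}
VSL(a) &= \frac{L(a)}{u'[c]} = \frac{u[c]}{u'[c]} \frac{1}{\ell(a)} \int_a^{a_u} \exp[-(\rho - n)(t - a)]\,\ell(t)\,dt \\
&= \frac{g}{q}\,e_d(a, \rho, n)
\end{aligned} \tag{37}$$

because $u[c]/u'[c] = c/q$ with $c = g$. The "willingness-to-pay" has been
defined as

$$WTP(a) = VSL(a)\,dm \tag{38}$$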

In analogy to Pandey/Nathwani [31], and here we differ from the related
economics literature, these quantities are averaged over the age
distribution $h(a, n)$ in a stable population in order to take proper account
of the composition of the population exposed to hazards in and from
technical objects. One obtains the "societal value of a statistical life"

$$SVSL = \frac{g}{q}\,\bar{e} \tag{39}$$

with

$$\bar{e} = \int_0^{a_u} e_d(a, \rho, n)\,h(a, n)\,da \tag{40}$$

and the "societal willingness-to-pay" as:

$$SWTP = SVSL\,dm \tag{41}$$

For $\rho = 0$ the averaged "discounted" life expectancy $\bar{e}$ is a quantity
which is about 60% of $e$ and considerably less than that for larger $\rho$. In
this purely economic consideration it appears appropriate to define also
the undiscounted average lost earnings in case of death, i.e. the so-called
"human capital":

$$HC = \int_0^{a_u} g\,(e - a)\,h(a, n)\,da \tag{42}$$

Table 1 shows the SVSL for some selected countries as a function of $\rho$,
indicating the importance of a realistic assessment of $\rho$.
Table 1: SVSL 10^ in PPP US$ for some countries for various $\rho$ (from
recent complete life tables provided by national statistical offices)



It can reasonably be assumed that the life risk in and from technical
facilities is uniformly distributed over the age and sex of those affected.
Also, it is assumed that everybody uses such facilities and, therefore, is
exposed to possible fatal accidents. The total cost of a safety related
regulation per member of the group and year is
$SWTP = -dC_Y(p) = -\frac{1}{N} \sum_{i=1}^{r} dC_{Y,i}(p)$ where $r$ is the
total number of objects under discussion, each with incremental cost
$dC_{Y,i}$, and $N$ is the group size. For simplicity, the design parameter is
temporarily assumed to be a scalar. This gives:

$$-dC_Y(p) - SVSL\,dm = 0 \tag{43}$$

Let $dm$ be proportional to the change $dh(p)$ of the mean failure rate, i.e.
it is assumed that the process of failures and renewals is already in a
stationary state, that is for $t \to \infty$. Rearrangement yields

$$\frac{dC_Y(p)}{dh(p)} = -k\,SVSL \tag{44}$$

where

$$dm = k\,dh(p), \quad 0 < k < 1 \tag{45}$$

with the proportionality constant $k$ relating the changes in mortality to
changes in the failure rate. Note that for any reasonable risk reducing
intervention there is necessarily $dh(p) < 0$.

The criterion eq. (44) is derived for safety-related regulations for a
larger group in a society or the entire society. Can it also be applied
to individual technical projects? SVSL as well as HC were related to
one anonymous person. For a specific project it makes sense to apply
criterion (44) to the whole group exposed. Therefore, the "life saving
cost" of a technical project with $N_F$ potential fatalities is:

$$H_F = HC\,k\,N_F \tag{46}$$

The monetary losses in case of failure are decomposed into $H = H_M + H_F$
in formulations of the type eq. (10) with $H_M$ the losses not related to
human life and limb.

Criterion (44) changes accordingly into:

$$\frac{dC_Y(p)}{dh(p)} = -SVSL\,k\,N_F \tag{47}$$

All quantities in eq. (47) are related to one year. For a particular
technical project all design and construction cost, denoted by $dC(p)$,
must be raised at the decision point $t = 0$. The yearly cost must be
replaced by the erection cost $dC(p)$ at $t = 0$ on the left hand side of
eq. (47) and discounting is necessary. The method of discounting is
the same as for discharging an annuity. If the public is involved $dC_Y(p)$
may be interpreted as cost of societal financing of $dC(p)$. The interest
rate to be used must then be a societal interest rate to be discussed
below. Otherwise the interest rate is the market rate. $g$ in SVSL also
grows approximately exponentially with rate $\zeta$, the rate of economic
growth in a country. It can be taken into account by discounting. The
acceptability criterion for individual technical projects then is (discount
factor for discounted erection cost moved to the right hand side):



^ ^ Cexppi 

dn[p) 7 exp [ 7 tJ exp — 1 



-SVSLkNF- 

7 



(48) 



It must be mentioned that a similar, very convincing consideration about
the necessary effort to reduce the risk for human life from technical
objects has been given by Nathwani et al. [29] and in [31], producing
estimates for the equivalent of the constant SVSL very close to those
given in table 1. The estimates for SVSL are in good agreement with
several other estimates in the literature (see, for example, [49], [43], [52],
[24] and many others) which are between 1,000,000 and 10,000,000 PPP
US$ with a clustering around 5,000,000 PPP US$.

6. Remarks about Interest Rates 

A cost-benefit optimization must use interest rates. Considering the
time horizon of some 20 to more than 100 years for most structural
facilities but also for many risky industrial installations it is clear that
average interest rates net of in/deflation must be chosen. If the option
with systematic reconstruction is chosen one immediately sees from eq.
(14) that the interest rate must be non-zero. From the same equation we
see that there is a maximum interest rate $\gamma_{\max}$ above which $Z(p)$
becomes negative for any $p$

$$\gamma_{\max} = \frac{m(p)\,b - (C(p) + H)}{m(p)\,C(p)} \tag{49}$$

and, therefore, $0 < \gamma < \gamma_{\max}$. Also $m(p)\,b > C(p) + H$ must be valid
for any reasonable project which further implies that $b/\gamma > 1$. Very
small interest rates, on the other hand, cause benefit and damage cost
to dominate over the erection cost. Then, in the limit



$$\gamma\,Z(p) \approx b - \frac{C(p) + H}{m(p)} \tag{50}$$

where the interest rate vanishes. Erection costs are normally weakly
increasing in the components of $p$ but $m(p)$ grows significantly with $p$.
Consequently, the optimum is reached for $m(p) \to \infty$, that is for perfect
safety which is not attainable in practice. In other words the interest rate
must be distinctly different from zero. Otherwise, the different parties
involved in the project may use interest rates taken from the financial
market at the decision point $t = 0$.
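
The admissible range of interest rates can be made concrete with the asymptotic renewal intensity $1/m(p)$. The sketch below assumes the asymptotic objective $Z \approx b/\gamma - C - (C + H)/(\gamma m)$, which follows from eq. (10) with the asymptotic result of eq. (11), and solves $Z = 0$ for the maximum interest rate. All numbers and names are hypothetical.

```python
def objective_asymptotic(b, gamma, cost, failure_cost, mean_renewal_time):
    """Z = b/gamma - C - (C + H) / (gamma * m), using h* ~ 1 / (gamma * m)."""
    return b / gamma - cost - (cost + failure_cost) / (gamma * mean_renewal_time)


def max_interest_rate(b, cost, failure_cost, mean_renewal_time):
    """Interest rate at which the asymptotic objective becomes zero."""
    return (mean_renewal_time * b - (cost + failure_cost)) / (mean_renewal_time * cost)


# Hypothetical values: yearly benefit, erection cost, failure cost, mean life.
b, C, H, m = 10.0, 100.0, 300.0, 1000.0
gamma_max = max_interest_rate(b, C, H, m)
z_at_max = objective_asymptotic(b, gamma_max, C, H, m)
z_below = objective_asymptotic(b, 0.05, C, H, m)
z_above = objective_asymptotic(b, 0.12, C, H, m)
print(gamma_max, z_at_max, z_below, z_above)
```

In this example the project is worthwhile only for interest rates below roughly 9.6% per year.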

The cost for saving life years also enters into the objective function 
and with it the question of discounting those cost also arises. At first 
sight this is not in agreement with our moral value system. However, 
a number of studies summarized in [32] and [23] express a rather clear 
opinion based on ethical and economical arguments. The cost for saving 
life years must be discounted at the same rate as other investments, 
especially in view of the fact that our present value system should be 
maintained also for future generations. Otherwise serious inconsistencies 
cannot be avoided. 

What should then the discount rate for public investments into life 
saving projects be? A first estimate could be based on the long term 
growth rate of the GNP. In most developed, industrial countries this
was a little more than 2% over the last 50 years. The United Nations
Human Development Report 2000 gives values between 1.2 and 1.9%
for industrialized countries during 1975-1998. If one extends the
consideration to the last 120 years one finds an average growth rate $\zeta$
of about 1.8% (see table 2). Using data in [47], [27] and the UN Human
Development Report 2000 [50] the following table has been compiled from a
more detailed table.
Table 2: Social indices for some developed industrial countries (all
monetary values are in US$, 1998)

It is noted that economic growth in the first half of the last century was
substantially below average while in the second half it was well above
average. The above considerations can at least define the range of interest
rates to be used in long term public investments into life saving operations.
For the discount rates to be used in long term public investments the
growth theory established by Solow [48] is applied, i.e.

$$n + \zeta(1 - \epsilon) < \rho < \gamma < \gamma_{\max} < n + \epsilon\zeta \tag{51}$$

where $\epsilon = 1 - q$ is the so-called elasticity of marginal consumption
(income). There is much debate about interest rates for long term public
investments, especially if sustainability aspects are concerned. But there
is an important mathematical result which may guide our choice. Weitzman
[54] and others showed that the far-distant future should be discounted
at the lowest possible rate > 0 if there are different possible scenarios
each with a given probability of being true.
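
The effect behind this result can be illustrated by averaging discount factors over scenarios. The following minimal sketch computes the effective (certainty-equivalent) discount rate $-\ln\left(\sum_i p_i \exp[-r_i t]\right)/t$ for a hypothetical set of equally likely rates. The scenario rates and probabilities are assumptions made for this illustration and are not taken from the cited work.

```python
import math


def effective_discount_rate(rates, probabilities, t):
    """Rate r(t) such that exp(-r(t) * t) equals the expected discount factor."""
    factor = sum(p * math.exp(-r * t) for r, p in zip(rates, probabilities))
    return -math.log(factor) / t


# Hypothetical scenarios: three equally likely yearly discount rates.
rates = [0.01, 0.03, 0.05]
probs = [1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0]
near = effective_discount_rate(rates, probs, 1.0)
far = effective_discount_rate(rates, probs, 1000.0)
print(near, far)
```

For short horizons the effective rate is close to the mean of the scenario rates, while for very long horizons it approaches the smallest rate, because the scenario with the lowest rate dominates the expected discount factor.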

7. A One-Level Optimization for Structural 
Components 

Let us now turn to the technical aspects of optimization. Cost-benefit
optimization according to eq. (3) or (10) in principle requires two levels
of optimization, one to minimize cost and the other to solve the reliability
problem. However, it is possible to reduce it to one level by adding the
Kuhn-Tucker condition of the reliability problem to the cost optimization
task provided that the reliability task is formulated in the transformed
standard space. For the task in eq. (3) we have



Maximize: Z(p) = B* - C{p) - (C7(p) + Hm + Hp) • 

Subject to: 

g{u,p) = 0 

Wj||Vui?(u,p)|| + Vu5(u,p)i||u|| =0; i = 1 

hk(p) <0,k = 

VpC(p) > kSVSLNp^VpPf(p) 

(52) 

where the first and second condition represent the Kuhn-Tucker condition
for a valid "critical" point, the third condition some restrictions on
the parameter vector $p$ and the fourth condition the human life criterion
in eq. (48). Frequently, the term $\frac{P_f(p)}{1 - P_f(p)}$ in the objective
can be replaced by $P_f(p)$. The failure probability is

$$P_f(p) \approx \Phi(-\beta(p))\,C_{SORM} \tag{53}$$

and we have to require that $\|u\| \ne 0$ and $\|\nabla_u g(u, p)\| \ne 0$. It is
assumed that the second-order correction $C_{SORM}$ is nearly independent of
$p$. In fact, at the expense of some more numerical effort, one can use any
update of the first-order result $\Phi(-\beta(p))$, for example an update by
importance sampling provided that the result of importance sampling
is formulated as a correction factor to the first-order result.
$\nabla_p C(p)$ usually must be determined numerically.

For time-variant problems as in eq. (10) one finds the outcrossing rate
for a combination of rectangular wave and differentiable processes as:

    $\nu^{+}(p) = \Big[ \sum_j \lambda_j \Phi(-\beta) + \omega_0 \cdots \Big]\, C_{SORM}$    (54)



The optimization task is 
Maximize:  $Z(p) = \frac{b}{\gamma} - C(p) - (C(p) + H_M + H_F)\,\frac{\nu^{+}(p)}{\gamma}$

Subject to:

    $g(u, p) = 0$

    $u_i\,\|\nabla_u g(u, p)\| + \nabla_u g(u, p)_i\,\|u\| = 0; \quad i = 1, \ldots, n - 1$

    $h_k(p) \le 0, \quad k = 1, \ldots, q$

    $\nabla_p C(p) \ge -k\, SVSL\, N_F\, \nabla_p \nu^{+}(p)$

(55a)

For the case in eq. (15) one replaces $\nu^{+}(p)$ by $\lambda P_f(p)$ and
$\nabla_p \nu^{+}(p)$ by $\nabla_p \lambda P_f(p)$.

The optimization tasks in eq. (52) or in (55a) are conveniently
performed by suitable SQP-algorithms (for example, [44], [33]). For both
formulations, eq. (52) and (55a), gradient-based optimizers require the
gradients of the objective as well as the gradients of all constraints.
This means that second derivatives are required in order to calculate
the gradient of the second condition as well as of the human value
criterion, in particular, the entries of the Hessian of $g(u, p)$. This
is also the most serious objection against this form of a one-level
approach. One can, however, proceed iteratively for well-behaved failure
surfaces. Initially, one assumes a linear or linearized failure surface
and sets $C_{SORM} = 1$. Then, all Hessian entries are zero. After a
first solution of problem (52) or (55a) one determines the Hessian once
at the solution point $(u^{*(1)}, p^{(1)})$ and with it also calculates
$C_{SORM}$. Problems (52) or (55a) are then solved a second time with
fixed Hessian $G(u^{*(1)}, p^{(1)})$ and so forth. This scheme is
repeated until convergence is reached, which usually happens after a few
steps. From a practical point of view it is frequently sufficient to use
first-order reliability results, and no iteration is necessary.

In closing this section it is important to note that the optimization 
tasks as formulated in eq. (52) and (55a) are among the easiest one 
can think of. In practice safety related design decisions additionally 
include changes in the lay-out, in the structural system or in the main- 
tenance strategy. Optimization is over discrete sets of design alterna- 
tives. Clearly, this is more difficult and very little is known how to do it 
formally except in a heuristic, empirical manner in small dimensions. 

8. Example 

As an example we take a rather simple case of a system where failure 
is defined if the random resistance or capacity is exceeded by the random 
demand, i.e. the failure event is defined as $F = \{R - S(t) < 0\}$. The
demand is modelled as a one-dimensional, stationary marked Poissonian
renewal process of disturbances (earthquakes, wind storms, explosions,
etc.) with stationary renewal rate $\lambda$ and random, independent
sizes of the disturbances $S_i$, $i = 1, 2, \ldots$. Random resistance
is log-normally distributed with mean p and a coefficient of variation
$V_R$. The disturbances are also independently log-normally distributed
with mean equal to unity and coefficient of variation $V_S$. A
disturbance causes failure with probability:



    $P_f(p) = \Phi\left( -\frac{\ln\left( p \sqrt{(1 + V_S^2)/(1 + V_R^2)} \right)}{\sqrt{\ln\left((1 + V_R^2)(1 + V_S^2)\right)}} \right)$    (56)
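The failure probability of a single disturbance is easy to evaluate numerically. The following minimal sketch assumes the log-normal model stated above (resistance with mean p and coefficient of variation V_R, disturbance with unit mean and coefficient of variation V_S); the function names are illustrative and not taken from the paper.

```python
import math


def std_normal_cdf(x: float) -> float:
    """Standard normal distribution function Phi(x), via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def reliability_index(p: float, v_r: float, v_s: float) -> float:
    """Reliability index beta(p) for log-normal resistance (mean p) and demand (mean 1)."""
    numerator = math.log(p * math.sqrt((1.0 + v_s**2) / (1.0 + v_r**2)))
    denominator = math.sqrt(math.log((1.0 + v_r**2) * (1.0 + v_s**2)))
    return numerator / denominator


def failure_probability(p: float, v_r: float, v_s: float) -> float:
    """Probability that one disturbance causes failure: P_f(p) = Phi(-beta(p))."""
    return std_normal_cdf(-reliability_index(p, v_r, v_s))


if __name__ == "__main__":
    # Illustrative call; the coefficients of variation are free inputs here.
    print(failure_probability(4.0, 0.2, 0.3))
```

Multiplying the result by the renewal rate of the disturbance process gives the failure rate used in the remainder of the example.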



Thus, the failure rate is $\lambda P_f(p)$ and the Laplace transform of the
renewal density is:

    $h^{*}(\gamma, p) = \frac{\lambda P_f(p)}{\gamma}$    (57)

An appropriate objective function given systematic reconstruction then 
is 



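    $\frac{Z(p)}{C_0} = \frac{b}{\gamma C_0} - \frac{C(p)}{C_0} - \frac{C(p) + H_M + H_F}{C_0}\,\frac{\lambda P_f(p)}{\gamma}$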






which is to be maximized. The criterion (62) has the form:

    $\frac{d}{dp} C(p) \ge -k\, SVSL\, N_F\, \lambda\, \frac{d}{dp} P_f(p)$


Some more or less realistic, typical parameter assumptions are:
$C_0 = 10^6$, $C_1 = 10^4$, $a = 1.25$, $H_M = 3 \cdot C_0$, $V_R = 0.2$,
$V_S = 0.3$, and $\lambda = 1$ [1/year]. The socio-economic demographic
data are $e = 77$, GDP = 25000, $g = 15000$, $w = 0.15$, $N_F = 100$,
$k = 0.1$ so that $H_F = HC\, k N_F = 5.8 \cdot 10^6$ and
$SVSL\, k N_F = 3.3 \cdot 10^7$. The value of $N_F$ is chosen
relatively large for demonstration purposes. Monetary values are in US$.
Optimization is performed for the public and for the owner separately. 

For the public, $b_S = 0.02\,C_0$ and $\gamma_S = 0.0185$ are chosen. Also, we
take ;^ = 1 for simplicity. In particular, benefit and discount rate are 
chosen such that the public does not make direct profit from an economic 
activity of its members. Optimization including the cost $H_F$ gives
$p_S^* = 4.35$; the corresponding failure rate is $1.2 \cdot 10^{-5}$.
Criterion (48) is already fulfilled for $p_{lim} = 3.48$, corresponding
to a yearly failure rate of $1.6 \cdot 10^{-4}$, but with
$Z_S(p_{lim})/C_0$ already being negative. It is interesting to see that
in this case the public can do better by adopting the optimal solution
rather than just realizing the facility at its acceptability limit, as
pointed out already earlier.

The owner uses some typical values of $b_O = 0.07\,C_0$ and
$\gamma_O = 0.05$ and does or does not include life saving cost. If he
includes life saving cost the objective function is shifted to the right
(dashed line). The calculations yield $p_O^* = 3.76$ and $p_O^* = 4.03$,
respectively, and the corresponding failure rates are
$7.1 \cdot 10^{-5}$ and $3.2 \cdot 10^{-5}$. The acceptability criterion
limits the owner’s region for reasonable designs. Inclusion of life saving 
cost has relatively little influence on the position of the optimum. 
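The reported optima can be checked approximately with a short script. This is only a sketch under stated assumptions: the cost model C(p) = C_0 + C_1 p^a is inferred from the parameter names and is not spelled out in the text, the objective follows the structure of (55a) with the outcrossing rate replaced by the failure rate of the Poissonian disturbance process, and a plain grid search stands in for a proper optimizer. All function names are hypothetical.

```python
import math


def failure_probability(p, v_r=0.2, v_s=0.3):
    """P_f(p) for log-normal resistance (mean p) and unit-mean log-normal demand."""
    beta = math.log(p * math.sqrt((1 + v_s**2) / (1 + v_r**2))) / math.sqrt(
        math.log((1 + v_r**2) * (1 + v_s**2)))
    return 0.5 * math.erfc(beta / math.sqrt(2.0))   # equals Phi(-beta)


def objective(p, b, gamma, h_f, c0=1e6, c1=1e4, a=1.25, h_m=3e6, lam=1.0):
    """Z(p)/C0 under the ASSUMED cost model C(p) = C0 + C1 * p**a."""
    cost = c0 + c1 * p**a
    risk = (cost + h_m + h_f) * lam * failure_probability(p) / gamma
    return (b / gamma - cost - risk) / c0


def argmax_on_grid(f, lo=2.0, hi=6.0, step=0.001):
    """Crude grid search; adequate for a smooth one-dimensional objective."""
    n = int(round((hi - lo) / step))
    return max((lo + i * step for i in range(n + 1)), key=f)


C0 = 1e6
# Owner without and with life saving cost, then the public.
p_owner = argmax_on_grid(lambda p: objective(p, b=0.07 * C0, gamma=0.05, h_f=0.0))
p_owner_life = argmax_on_grid(lambda p: objective(p, b=0.07 * C0, gamma=0.05, h_f=5.8e6))
p_public = argmax_on_grid(lambda p: objective(p, b=0.02 * C0, gamma=0.0185, h_f=5.8e6))
print(p_owner, p_owner_life, p_public)
```

Under these assumptions the grid optima land close to the values quoted in the text, which suggests that the assumed cost model is at least compatible with the reported results.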




Figure 3. Objective function of owner and society 



It is noted that the stochastic model and the variability of capacity 
and demand also play an important role for the magnitude and location 
of the optimum as well as the acceptability limit. The specific marginal 
cost (rate of change) of a safety measure and its effect on a reduction of 
the failure rate are equally important. 

9. Conclusions 

Optimization techniques are essential ingredients of reliability-oriented 
optimal designs of technical facilities. Although many technical aspects 
are not yet solved and the available spectrum of models and methods 
in structural reliability is still limited, many practical problems can
be solved. A special one-level optimization is proposed for general
cost-benefit analysis. In this paper, however, focus is on some more
critical issues, for example, "what is a reasonable replacement strategy
for structural facilities?", "how safe is safe enough?" and "how to
discount losses of material, opportunity and human lives?". An attempt
has been made to give at least partial answers. Only if those issues
have an answer does overall optimization of technical facilities with
respect to cost make sense.
Abstract Traditional models in multistage stochastic programming are directed to
minimizing the expected value of random optimal costs arising in a mul- 
tistage, non-anticipative decision process under uncertainty. Motivated 
by risk aversion, we consider minimization of the probability that the 
random optimal costs exceed some preselected threshold value. For the 
two-stage case, we analyse structural properties and propose algorithms 
both for models with integer decisions and for those without. Extension 
of the modeling to the multistage situation concludes the paper. 

Keywords: Stochastic programming, mixed-integer optimization. 



1. Introduction 

Stochastic programs with recourse arise as deterministic equivalents 
to random optimization problems. In the present paper the main accent
will be placed on the two-stage situation, and the most general random
optimization problems to be considered are random mixed-integer linear
programs. These are accompanied by a two-stage scheme of alternating
decision and observation. After having decided on parts of the variables
in a first stage, the random data affecting the problem are observed, and
in turn the remaining (second-stage or recourse) variables are fixed. In
our present analysis two basic assumptions underlie this scheme. First,
and naturally, the first-stage decision has to be taken on a “here-and- 
now” basis, i.e., it must not depend on (or anticipate) the outcome of 
the random data. Secondly, and providing some modeling restriction, 
the first-stage decision does not influence the probability distribution of
the random data.

In multistage stochastic programs the above two-stage scheme is ex- 
tended into a finite horizon sequential decision process under uncertainty. 
Again we have to maintain nonanticipativity of decisions, and, so far, 
almost all results concern problems where the decisions do not influence
the probability distribution of the random data. In the final section of 
the present paper we will return to multistage stochastic programs. 
After having sketched the rules for how to make decisions, let us now 
discuss criteria for how to select a “best” decision. In this respect, the 
existing literature on stochastic programs with recourse (cf. the text- 
books [5, 15, 20] and the references therein) almost unanimously suggests 
to start out from expectations of objective function values of the ran- 
dom optimization problem. For two-stage models (in a cost minimiza- 
tion framework) this implies that the deterministic first-stage decision is 
selected such that the expectation of the sum of the deterministic first- 
stage costs and the random second-stage costs (induced by the random 
data and an optimal second-stage decision) becomes minimal. Such a 
criterion has proven useful in many applications. In case the random 
optimization problem is a linear program without integer requirements, 
the resulting stochastic program with recourse enjoys convexity in the 
first-stage variables. This enabled application of powerful tools from 
convex analysis, both for structural investigations and algorithm design 
(cf. [4, 5, 15, 20, 32]). 

In the present paper, we will discuss recourse stochastic programs where 
the optimization is based on minimizing the probability that the above 
sum of deterministic and random costs exceeds a given threshold value. 
Such models provide an opportunity to address risk aversion in the 
framework of recourse stochastic programming. 

The proposal to replace the usual expectation-based objective function 
in recourse stochastic programming by a probability objective seemingly 
dates back to Bereanu [2] and, hitherto, has not been elaborated in much 
detail. Reformulating the stochastic program by adding another variable 
and including level sets of the objective into the constraints leads to a 
chance constrained stochastic program which is nonconvex in general. 
We will see that, along this line, some structural knowledge on chance 
constraints (cf. [5, 15, 16, 20, 29]) reappears in the structural analy- 
sis of our models. Algorithmically, we will view several well-established 
techniques from a fresh perspective. Among them there are cutting 
planes from convex subgradient optimization, Lagrangian relaxation of 
mixed-integer programs, and decomposition techniques for block-angular 
stochastic programs. 

The paper is organized as follows. In Section 2 we formalize the
modeling outlined above, collect some prerequisites, and compare with the
usual expectation-based modeling in recourse stochastic programming. 
Section 3 is devoted to structural results. In Section 4 we present some 
first algorithmic approaches. Separate attention is paid to models with- 
out integer decisions since they allow for an algorithmic shortcut. As 
already announced, the final section will discuss the extension of our 
modeling to multistage stochastic programs. 



2. Modeling 

Consider the following random mixed-integer linear program

    $\min \{ c^T x + q^T y + q'^T y' \}$

    s.t.  $Tx + Wy + W'y' = h(\omega)$,    (1)

          $x \in X,\ y \in \mathbb{Z}_+^{\bar m},\ y' \in \mathbb{R}_+^{m'}$.

We assume that all ingredients above have conformal dimensions, that
$W$, $W'$ are rational matrices, and that $X \subseteq \mathbb{R}^m$ is a
nonempty closed polyhedron. Integer requirements on components of x are
formally possible but will not be imposed for ease of exposition. For
the same reason, randomness is kept as simple as possible by assuming
that only the right-hand side $h(\omega) \in \mathbb{R}^s$ is random,
i.e., a random vector on some probability space
$(\Omega, \mathcal{A}, \mathbb{P})$. Decision variables are divided into
two groups: first-stage variables x to be fixed before and second-stage
variables $(y, y')$ to be fixed after observation of $h(\omega)$.

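Let us denote

    $\Phi(t) := \min\{ q^T y + q'^T y' : Wy + W'y' = t,\ y \in \mathbb{Z}_+^{\bar m},\ y' \in \mathbb{R}_+^{m'} \}$.    (2)

According to integer programming theory ([19]), this function is
real-valued on $\mathbb{R}^s$ provided that
$W(\mathbb{Z}_+^{\bar m}) + W'(\mathbb{R}_+^{m'}) = \mathbb{R}^s$ and
$\{u \in \mathbb{R}^s : W^T u \le q,\ W'^T u \le q'\} \ne \emptyset$,
which, therefore, will be assumed throughout.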

The classical expectation-based stochastic program with recourse now is
the optimization problem

    $\min \Big\{ \int_\Omega \big( c^T x + \Phi(h(\omega) - Tx) \big)\, \mathbb{P}(d\omega) : x \in X \Big\}$.    (3)

The recourse stochastic program with probability objective reads

    $\min \Big\{ \mathbb{P}\big( \{ \omega \in \Omega : c^T x + \Phi(h(\omega) - Tx) > \varphi_0 \} \big) : x \in X \Big\}$    (4)

where $\varphi_0 \in \mathbb{R}$ denotes some preselected threshold (some
ruin level in a cost framework, for instance). For convenience, we will
call (3) the expectation-based and (4) the probability-based recourse
model. In doing so, we are well aware of the fact that, of course, (4)
is expectation-based too, if probabilities are understood as
expectations of indicator functions.

We will see in a moment that both (3) and (4) are well-defined nonlinear
optimization problems. Their objective functions are denoted by
$Q_{\mathbb{E}}(x)$ and $Q_{\mathbb{P}}(x)$, respectively. To detect
their structure, the function $\Phi$ is crucial, which arises as a value
function of a mixed-integer linear program. From parametric optimization
([1, 6]) the following is known.

Proposition 2.1 Assume that
$W(\mathbb{Z}_+^{\bar m}) + W'(\mathbb{R}_+^{m'}) = \mathbb{R}^s$ and
$\{u \in \mathbb{R}^s : W^T u \le q,\ W'^T u \le q'\} \ne \emptyset$.
Then it holds

(i) $\Phi$ is real-valued and lower semicontinuous on $\mathbb{R}^s$,

(ii) there exists a countable partition
$\mathbb{R}^s = \bigcup_{i=1}^{\infty} \mathcal{T}_i$ such that the
restrictions of $\Phi$ to $\mathcal{T}_i$ are piecewise linear and
Lipschitz continuous with a uniform constant $L > 0$ not depending on i,

(iii) each of the sets $\mathcal{T}_i$ has a representation
$\mathcal{T}_i = \{t_i + \mathcal{K}\} \setminus \bigcup_{j=1}^{N} \{t_{ij} + \mathcal{K}\}$
where $\mathcal{K}$ denotes the polyhedral cone $W'(\mathbb{R}_+^{m'})$
and $t_i, t_{ij}$ are suitable points from $\mathbb{R}^s$; moreover, N
does not depend on i,

(iv) there exist positive constants $\beta, \gamma$ such that
$|\Phi(t_1) - \Phi(t_2)| \le \beta \|t_1 - t_2\| + \gamma$ whenever
$t_1, t_2 \in \mathbb{R}^s$.

In case $\bar m = 0$, i.e., if there are no integer requirements in the
second stage, $\Phi$ becomes the value function of a linear program.
Under the assumptions of Proposition 2.1, $\Phi$ is real-valued on
$\mathbb{R}^s$. By linear programming duality it is convex, piecewise
linear, and admits a representation

    $\Phi(t) = \max_{j = 1, \ldots, J} d_j^T t$

where $d_1, \ldots, d_J$ are the vertices of
$\{u \in \mathbb{R}^s : W'^T u \le q'\}$, which is a compact set in this
case.

As an immediate conclusion we obtain that, without integer requirements
in the second stage, $1 - Q_{\mathbb{P}}(x)$ coincides with the
probability of a closed polyhedron, providing a direct link to chance
constrained stochastic programming ([5, 15, 20]).
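For a discrete distribution and linear recourse, both objectives reduce to finite sums and can be evaluated directly. The sketch below is purely illustrative: it assumes a one-dimensional simple-recourse instance in which the value function is the maximum of two linear pieces (shortage penalty 3, surplus penalty 0.5), and all data and function names are hypothetical.

```python
from typing import Sequence


def phi(t: float, vertices: Sequence[float]) -> float:
    """LP value function as the maximum of finitely many linear functions d_j * t."""
    return max(d * t for d in vertices)


def q_expectation(x, c, scenarios, probs, vertices, T=1.0):
    """Expectation objective: sum_j pi_j * (c*x + Phi(h_j - T*x))."""
    return sum(pi * (c * x + phi(h - T * x, vertices))
               for h, pi in zip(scenarios, probs))


def q_probability(x, c, phi0, scenarios, probs, vertices, T=1.0):
    """Probability objective: total probability of scenarios whose cost exceeds phi0."""
    return sum(pi for h, pi in zip(scenarios, probs)
               if c * x + phi(h - T * x, vertices) > phi0)


# Hypothetical data: unit first-stage cost, threshold phi0 = 4.
VERTICES = [3.0, -0.5]          # vertices of {u : u <= 3, -u <= 0.5}
SCENARIOS = [1.0, 2.0, 4.0]     # realizations h_j
PROBS = [0.5, 0.3, 0.2]         # probabilities pi_j

if __name__ == "__main__":
    for x in (1.0, 2.0, 4.0):
        print(x,
              q_expectation(x, 1.0, SCENARIOS, PROBS, VERTICES),
              q_probability(x, 1.0, 4.0, SCENARIOS, PROBS, VERTICES))
```

In this toy instance the expectation objective distinguishes x = 1 from x = 2, while the probability objective takes the same value at both points: it only counts how much probability mass lies above the threshold.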

Before we turn our attention to $Q_{\mathbb{P}}(x)$, we review some
properties of $Q_{\mathbb{E}}(x)$. For convenience we denote by $\mu$ the
image measure $\mathbb{P} \circ h^{-1}$ on $\mathbb{R}^s$. Without
integer requirements ($\bar m = 0$), convexity of $\Phi$ extends to
$Q_{\mathbb{E}}$ under mild conditions. A standard result of stochastic
linear programming reads

Proposition 2.2 Assume $\bar m = 0$,
$W'(\mathbb{R}_+^{m'}) = \mathbb{R}^s$,
$\{u \in \mathbb{R}^s : W'^T u \le q'\} \ne \emptyset$, and
$\int_{\mathbb{R}^s} \|h\|\, \mu(dh) < \infty$. Then
$Q_{\mathbb{E}} : \mathbb{R}^m \to \mathbb{R}$ is a real-valued convex
function.

As already mentioned in the introduction, convexity has been ex- 
ploited extensively in stochastic linear programming. For further read- 
ing we refer to the textbooks [5, 15, 20]. The remaining models, both 
expectation- and probability-based, to be discussed in the present paper 
enjoy convexity merely in exceptional situations. Straightforward exam- 
ples (cf. e.g. [35]) confirm that convexity in (3) is lost already for very 
simple models as soon as integer requirements enter the second stage. 
In [33] the following is shown. 

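Proposition 2.3 Assume that
$W(\mathbb{Z}_+^{\bar m}) + W'(\mathbb{R}_+^{m'}) = \mathbb{R}^s$,
$\{u \in \mathbb{R}^s : W^T u \le q,\ W'^T u \le q'\} \ne \emptyset$, and
$\int_{\mathbb{R}^s} \|h\|\, \mu(dh) < \infty$. Then it holds

(i) $Q_{\mathbb{E}} : \mathbb{R}^m \to \mathbb{R}$ is a real-valued lower semicontinuous function,

(ii) if $\mu$ has a density, then $Q_{\mathbb{E}}$ is continuous on $\mathbb{R}^m$.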

3. Structure 

To analyse the structure of $Q_{\mathbb{P}}$ we introduce the notation

    $M(x) := \{h \in \mathbb{R}^s : c^T x + \Phi(h - Tx) > \varphi_0\}, \quad x \in \mathbb{R}^m$.

By $\liminf_{x_n \to x} M(x_n)$ and $\limsup_{x_n \to x} M(x_n)$ we
denote the (set theoretic) limes inferior and limes superior, i.e., the
set of all points belonging to all but a finite number of the sets
$M(x_n)$, $n \in \mathbb{N}$, and to infinitely many of the sets
$M(x_n)$, respectively. Moreover, we denote

    $M_e(x) := \{h \in \mathbb{R}^s : c^T x + \Phi(h - Tx) = \varphi_0\}$,

    $M_d(x) := \{h \in \mathbb{R}^s : \Phi \text{ is discontinuous at } h - Tx\}$.

Note that, by Proposition 2.1, both $M_e(x)$ and $M_d(x)$ are measurable
sets for all $x \in \mathbb{R}^m$.

Lemma 3.1 For all $x \in \mathbb{R}^m$ there holds

    $M(x) \subseteq \liminf_{x_n \to x} M(x_n) \subseteq \limsup_{x_n \to x} M(x_n) \subseteq M(x) \cup M_e(x) \cup M_d(x)$.

Proof: Let $h \in M(x)$. The lower semicontinuity of $\Phi$
(Proposition 2.1) yields

    $\liminf_{x_n \to x} \big( c^T x_n + \Phi(h - Tx_n) \big) \ge c^T x + \Phi(h - Tx) > \varphi_0$.

Therefore, there exists an $n_0 \in \mathbb{N}$ such that
$c^T x_n + \Phi(h - Tx_n) > \varphi_0$ for all $n \ge n_0$, implying
$h \in M(x_n)$ for all $n \ge n_0$. Hence,
$M(x) \subseteq \liminf_{x_n \to x} M(x_n)$.

Let $h \in \limsup_{x_n \to x} M(x_n) \setminus M(x)$. Then there exists
an infinite subset $\tilde{\mathbb{N}}$ of $\mathbb{N}$ such that

    $c^T x_n + \Phi(h - Tx_n) > \varphi_0 \ \ \forall n \in \tilde{\mathbb{N}}$  and  $c^T x + \Phi(h - Tx) \le \varphi_0$.

Now two cases are possible. First, $\Phi$ is continuous at $h - Tx$.
Passing to the limit in the first inequality then yields that
$c^T x + \Phi(h - Tx) \ge \varphi_0$, and $h \in M_e(x)$. Secondly,
$\Phi$ is discontinuous at $h - Tx$. In other words, $h \in M_d(x)$. □

Proposition 3.2 Assume that
$W(\mathbb{Z}_+^{\bar m}) + W'(\mathbb{R}_+^{m'}) = \mathbb{R}^s$ and
$\{u \in \mathbb{R}^s : W^T u \le q,\ W'^T u \le q'\} \ne \emptyset$.
Then $Q_{\mathbb{P}} : \mathbb{R}^m \to \mathbb{R}$ is a real-valued
lower semicontinuous function.

If in addition $\mu(M_e(x) \cup M_d(x)) = 0$, then $Q_{\mathbb{P}}$ is
continuous at x.

Proof: The lower semicontinuity of $\Phi$ ensures that $M(x)$ is
measurable for all $x \in \mathbb{R}^m$, and hence $Q_{\mathbb{P}}$ is
real-valued on $\mathbb{R}^m$. By Lemma 3.1 and the (semi-)continuity of
the probability measure on sequences of sets we have for all
$x \in \mathbb{R}^m$

    $Q_{\mathbb{P}}(x) = \mu(M(x)) \le \mu\big(\liminf_{x_n \to x} M(x_n)\big) \le \liminf_{x_n \to x} \mu(M(x_n)) = \liminf_{x_n \to x} Q_{\mathbb{P}}(x_n)$,

establishing the asserted lower semicontinuity. In case
$\mu(M_e(x) \cup M_d(x)) = 0$ this argument extends as follows

    $Q_{\mathbb{P}}(x) = \mu(M(x)) = \mu\big(M(x) \cup M_e(x) \cup M_d(x)\big) \ge \mu\big(\limsup_{x_n \to x} M(x_n)\big) \ge \limsup_{x_n \to x} \mu(M(x_n)) = \limsup_{x_n \to x} Q_{\mathbb{P}}(x_n)$,

and $Q_{\mathbb{P}}$ is continuous at x. □

Proposition 2.1 now reveals that, for given $x \in \mathbb{R}^m$, both
$M_e(x)$ and $M_d(x)$ are contained in a countable union of hyperplanes.
The latter being of Lebesgue measure zero we obtain that
$\mu(M_e(x) \cup M_d(x)) = 0$ is valid for all $x \in \mathbb{R}^m$
provided that $\mu$ has a density. This proves

Conclusion 3.3 Assume that
$W(\mathbb{Z}_+^{\bar m}) + W'(\mathbb{R}_+^{m'}) = \mathbb{R}^s$,
$\{u \in \mathbb{R}^s : W^T u \le q,\ W'^T u \le q'\} \ne \emptyset$,
and that $\mu$ has a density. Then $Q_{\mathbb{P}}$ is continuous on
$\mathbb{R}^m$.

This analysis can be extended towards Lipschitz continuity of
$Q_{\mathbb{P}}$. In [36], Tiedemann has shown

Proposition 3.4 Assume that $q, q'$ are rational vectors,
$W(\mathbb{Z}_+^{\bar m}) + W'(\mathbb{R}_+^{m'}) = \mathbb{R}^s$,
$\{u \in \mathbb{R}^s : W^T u \le q,\ W'^T u \le q'\} \ne \emptyset$,
and that for any nonsingular linear transformation
$B \in L(\mathbb{R}^s, \mathbb{R}^s)$ all one-dimensional marginal
distributions of $\mu \circ B$ have bounded densities which, outside
some bounded interval, are monotonically decreasing with growing
absolute value of the argument. Then $Q_{\mathbb{P}}$ is Lipschitz
continuous on any bounded subset of $\mathbb{R}^m$.

From a numerical viewpoint, the optimization problems (3) and (4) pose
the major difficulty that their objective functions are given by
multidimensional integrals with implicit integrands. If $h(\omega)$
follows a continuous probability distribution, the computation of
$Q_{\mathbb{E}}$ and $Q_{\mathbb{P}}$ has to rely on approximations.
Here, it is quite common to approximate the probability distribution of
$h(\omega)$ by discrete distributions, turning the integrals in (3), (4)
into sums this way. In the next section we will see that discrete
distributions, despite the poor analytical properties they imply for
$Q_{\mathbb{E}}$ and $Q_{\mathbb{P}}$, are quite attractive
algorithmically, since they allow for integer programming techniques.

Approximating the underlying probability measures in (3) and (4) raises
the question whether "small" perturbations in the measures result in
only "small" perturbations of optimal values and optimal solutions.
Subjective assumptions and incomplete knowledge on
$\mu = \mathbb{P} \circ h^{-1}$ in many practical modeling situations
provide further motivation for asking this question. Therefore,
stability analysis has gained some interest in stochastic programming
(for surveys see [9, 35]).

For the models (3) and (4), qualitative and quantitative continuity of
$Q_{\mathbb{E}}$, $Q_{\mathbb{P}}$ jointly in the decision variable x and
the probability measure $\mu$ becomes a key issue then. Once
established, the continuity, together with well-known techniques from
parametric optimization, leads to stability in the spirit sketched
above. In the present paper, we will not pursue stability analysis, but
show how to arrive at qualitative joint continuity of $Q_{\mathbb{P}}$.
For continuity results on $Q_{\mathbb{E}}$ refer to [14, 24, 30, 33, 34],
for extensions towards stability to [35] and the references therein.

For the rest of this section, we consider $Q_{\mathbb{P}}$ as a function
mapping from $\mathbb{R}^m \times \mathcal{P}(\mathbb{R}^s)$ to
$\mathbb{R}$. By $\mathcal{P}(\mathbb{R}^s)$ we denote the set of all
Borel probability measures on $\mathbb{R}^s$. While $\mathbb{R}^m$ is
equipped with the usual topology, the set $\mathcal{P}(\mathbb{R}^s)$ is
endowed with weak convergence of probability measures. This has proven
both sufficiently general to cover relevant applications and
sufficiently specific to enable substantial statements. A sequence
$\{\mu_n\}$ in $\mathcal{P}(\mathbb{R}^s)$ is said to converge weakly to
$\mu \in \mathcal{P}(\mathbb{R}^s)$, written $\mu_n \xrightarrow{w} \mu$,
if for any bounded continuous function $g : \mathbb{R}^s \to \mathbb{R}$
we have

    $\int_{\mathbb{R}^s} g(\xi)\, \mu_n(d\xi) \to \int_{\mathbb{R}^s} g(\xi)\, \mu(d\xi)$  as  $n \to \infty$.    (5)

A basic reference for weak convergence of probability measures is
Billingsley's book [3].
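Proposition 3.5 Assume that
$W(\mathbb{Z}_+^{\bar m}) + W'(\mathbb{R}_+^{m'}) = \mathbb{R}^s$ and
$\{u \in \mathbb{R}^s : W^T u \le q,\ W'^T u \le q'\} \ne \emptyset$.
Let $\mu \in \mathcal{P}(\mathbb{R}^s)$ be such that
$\mu(M_e(x) \cup M_d(x)) = 0$. Then
$Q_{\mathbb{P}} : \mathbb{R}^m \times \mathcal{P}(\mathbb{R}^s) \to \mathbb{R}$
is continuous at $(x, \mu)$.

Proof: Let $x_n \to x$ and $\mu_n \xrightarrow{w} \mu$ be arbitrary
sequences. By $\chi_n, \chi : \mathbb{R}^s \to \{0, 1\}$ we denote the
indicator functions of the sets $M(x_n)$, $M(x)$, $n \in \mathbb{N}$. In
addition, we introduce the exceptional set

    $E := \{h \in \mathbb{R}^s : \exists\, h_n \to h \text{ such that } \chi_n(h_n) \not\to \chi(h)\}$.

Now we have $E \subseteq M_e(x) \cup M_d(x)$. To see this, assume that
$h \in (M_e(x) \cup M_d(x))^c = (M_e(x))^c \cap (M_d(x))^c$ where the
superscript c denotes the set-theoretic complement. Then $\Phi$ is
continuous at $h - Tx$, and either $c^T x + \Phi(h - Tx) > \varphi_0$ or
$c^T x + \Phi(h - Tx) < \varphi_0$. Thus, for any sequence $h_n \to h$
there exists an $n_0 \in \mathbb{N}$ such that for all $n \ge n_0$
either $c^T x_n + \Phi(h_n - Tx_n) > \varphi_0$ or
$c^T x_n + \Phi(h_n - Tx_n) < \varphi_0$. Hence,
$\chi_n(h_n) \to \chi(h)$ as $h_n \to h$, implying $h \in E^c$.

In view of $E \subseteq M_e(x) \cup M_d(x)$ and
$\mu(M_e(x) \cup M_d(x)) = 0$ we obtain that $\mu(E) = 0$. A theorem on
weak convergence of image measures attributed to Rubin in [3], p. 34,
now yields that the weak convergence $\mu_n \xrightarrow{w} \mu$ implies
the weak convergence
$\mu_n \circ \chi_n^{-1} \xrightarrow{w} \mu \circ \chi^{-1}$.

Note that $\mu_n \circ \chi_n^{-1}$, $\mu \circ \chi^{-1}$,
$n \in \mathbb{N}$, are probability measures on $\{0, 1\}$. Their weak
convergence then particularly implies that

    $\mu_n \circ \chi_n^{-1}(\{1\}) \to \mu \circ \chi^{-1}(\{1\})$.

In other words, $\mu_n(M(x_n)) \to \mu(M(x))$ or
$Q_{\mathbb{P}}(x_n, \mu_n) \to Q_{\mathbb{P}}(x, \mu)$.

□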

As done for the expectation-based model (3) in [33], continuity of 
optimal values and upper semicontinuity of optimal solution sets of the 
probability-based model (4) can be derived from Proposition 3.5. 
Remark 3.6 (probability-based model without integer decisions)
Without integer second-stage variables the set $M_d(x)$ is always empty,
and Propositions 3.2 and 3.5 readily specialize. A direct approach to
these models including stability analysis and algorithmic techniques has
been carried out in [23]. Lower semicontinuity of $Q_{\mathbb{P}}$ in the
absence of integer variables can already be derived from Proposition 3.1
in [29], a statement concerning chance constrained stochastic programs.
Some early work on continuity properties of general probability
functionals has been done by Raik ([21, 22], see also [16, 20]).

4. Algorithms 

In the present section we will review two algorithms for solving the 
probability-based recourse problem (4) provided the underlying measure
$\mu$ is discrete, say with realizations $h_j$ and probabilities $\pi_j$, $j = 1, \ldots, J$.
The algorithms were first proposed in [23] and [36], respectively, where 
further details can be found. 



4.1 Linear Recourse 

We assume that there are no integer requirements on second-stage
variables, which is usually referred to as linear recourse in the
literature. Suppose that $\mu$ is the above discrete measure and consider
problem (4) with

    $\Phi(t) := \min\{ q^T y : Wy \ge t,\ y \in \mathbb{R}_+^{m'} \}$.    (6)

For ease of exposition let $X \subseteq \mathbb{R}^m$ be a nonempty
compact polyhedron. Let $e \in \mathbb{R}^s$ denote the vector of all
ones and consider the set

    $D := \{(u, u_0) \in \mathbb{R}^{s+1} : 0 \le u \le e,\ 0 \le u_0 \le 1,\ W^T u - u_0 q \le 0\}$

together with its extreme points $(d_k, d_{k0})$, $k = 1, \ldots, K$.
Furthermore, consider the indicator function

    $\chi(x, h) := \begin{cases} 1, & h \in M(x) \\ 0, & \text{otherwise.} \end{cases}$    (7)

The key idea of the subsequent algorithm is to represent $\chi$ by a
binary variable and a number of optimality cuts, which enables
exploitation of cutting plane techniques from convex subgradient
optimization. The latter have proven very useful in classical two-stage
linear stochastic programming, see e.g. [4, 32].
Lemma 4.1 There exists a sufficiently large constant $M_0 > 0$ such that
problem (4) can be equivalently restated as

    $\min_{x, \theta} \Big\{ \sum_{j=1}^{J} \pi_j \theta_j : (h_j - Tx)^T d_k + (c^T x - \varphi_0)\, d_{k0} \le M_0 \theta_j,$
    $\qquad x \in X,\ \theta_j \in \{0, 1\},\ k = 1, \ldots, K,\ j = 1, \ldots, J \Big\}$.    (8)

Proof: For any $x \in X$ and any $j \in \{1, \ldots, J\}$ consider the
feasibility problem

    $\min_{y, t, t_0} \{ e^T t + t_0 : Wy + t \ge h_j - Tx,\ q^T y - t_0 \le \varphi_0 - c^T x \}$    (9)

and its linear programming dual

    $\max_{u, u_0} \{ (h_j - Tx)^T u + (c^T x - \varphi_0)\, u_0 : 0 \le u \le e,\ 0 \le u_0 \le 1,\ W^T u - u_0 q \le 0 \}$.

Clearly, both programs are always solvable. Their optimal value is equal
to zero, if and only if $\chi(x, h_j) = 0$. In addition, D coincides
with the feasible set of the dual. If $M_0$ is selected as

    $M_0 := \max_{x \in X,\, k,\, j} \{ (h_j - Tx)^T d_k + (c^T x - \varphi_0)\, d_{k0} \}$,

then, for any $x \in X$, the vector $(x, \theta)$ with $\theta_j = 1$,
$j = 1, \ldots, J$ is feasible for (8).

If $\chi(x, h_j) = 1$ for some $x \in X$ and $j \in \{1, \ldots, J\}$,
then there has to exist some $k \in \{1, \ldots, K\}$ such that

    $(h_j - Tx)^T d_k + (c^T x - \varphi_0)\, d_{k0} > 0$.

Hence, given $x \in X$, $\theta_j = 0$ is feasible in (8) if and only if
$\chi(x, h_j) = 0$. Therefore, (8) is equivalent to
$\min\{ \sum_{j=1}^{J} \pi_j \chi(x, h_j) : x \in X \}$. □



The algorithm progresses by sequentially solving a master problem
and adding violated optimality cuts generated through the solution of
subproblems (9). These cuts correspond to constraints in (8). Assuming
that the cuts generated before iteration $\nu$ correspond to subsets
$\mathcal{K}_\nu \subseteq \{1, \ldots, K\}$ the current master problem
reads

    $\min_{x, \theta} \Big\{ \sum_{j=1}^{J} \pi_j \theta_j : (h_j - Tx)^T d_k + (c^T x - \varphi_0)\, d_{k0} \le M_0 \theta_j,$
    $\qquad x \in X,\ \theta_j \in \{0, 1\},\ k \in \mathcal{K}_\nu,\ j = 1, \ldots, J \Big\}$.    (10)

The full algorithm proceeds as follows. 
Algorithm 4.2

Step 1 (Initialization): Set $\nu = 0$ and $\mathcal{K}_0 = \emptyset$.

Step 2 (Solving the master problem): Solve the current master problem
(10) and let $(x^\nu, \theta^\nu)$ be an optimal solution.

Step 3 (Solving subproblems): Solve the feasibility problem (9) for
$x = x^\nu$ and all $j \in \{1, \ldots, J\}$ such that
$\theta_j^\nu = 0$. Consider the following situations:

1 If all these problems have optimal value equal to zero, then the
current $x^\nu$ is optimal for (8).

2 If some of these problems have optimal value strictly greater than
zero, then, via the dual solutions, a subset $(d_k, d_{k0})$,
$k \in \mathcal{K} \subseteq \{1, \ldots, K\}$ of extreme points of D is
identified. The corresponding cuts are added to the master.

Set $\mathcal{K}_{\nu+1} := \mathcal{K}_\nu \cup \mathcal{K}$ and $\nu := \nu + 1$; go to Step 2.

The algorithm terminates since D has a finite number of extreme 
points. For further details on correctness of the algorithm and first 
computational experiments we refer to [23]. 
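The loop structure of Algorithm 4.2 can be traced on a toy instance. The sketch below is not the implementation from [23]: it assumes a one-dimensional problem with recourse function 2*max(t, 0), replaces the mixed-integer master solver by enumeration over a grid standing in for X, and replaces the LP solver for the dual of (9) by enumeration of the four extreme points of D, which were worked out by hand for this instance. All data are hypothetical.

```python
# Tiny instance (all data hypothetical): Phi(t) = min{2y : y >= t, y >= 0}.
C, T, PHI0 = 1.0, 1.0, 4.0
SCENARIOS = [(1.0, 0.3), (3.0, 0.4), (6.0, 0.3)]     # pairs (h_j, pi_j)
X_GRID = [0.5 * i for i in range(11)]                 # stand-in for X = [0, 5]
# Extreme points (d_k, d_k0) of D = {0 <= u <= 1, 0 <= u0 <= 1, u - 2*u0 <= 0}.
VERTICES = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.5)]
TOL = 1e-9


def cut_value(x, h, k):
    """Left-hand side of the optimality cut for extreme point k, cf. (8)."""
    d, d0 = VERTICES[k]
    return (h - T * x) * d + (C * x - PHI0) * d0


def solve_master(cuts):
    """Master (10) by enumeration: theta_j must be 1 whenever some cut is violated."""
    best = None
    for x in X_GRID:
        theta = [int(any(cut_value(x, h, k) > TOL for k in cuts))
                 for h, _ in SCENARIOS]
        value = sum(pi * th for (_, pi), th in zip(SCENARIOS, theta))
        if best is None or value < best[0] - TOL:
            best = (value, x, theta)
    return best


def cutting_plane():
    cuts = set()                                      # K_0 is empty
    while True:
        value, x, theta = solve_master(cuts)          # Step 2
        new_cuts = set()
        for (h, _), th in zip(SCENARIOS, theta):      # Step 3
            if th == 0:
                # Dual of (9): maximize the cut value over the extreme points of D.
                k_best = max(range(len(VERTICES)), key=lambda k: cut_value(x, h, k))
                if cut_value(x, h, k_best) > TOL:
                    new_cuts.add(k_best)
        if not new_cuts:                              # all subproblems have value zero
            return x, value, sorted(cuts)
        cuts |= new_cuts


x_opt, q_opt, used_cuts = cutting_plane()
print(x_opt, q_opt, used_cuts)
```

On this instance a single cut suffices: the first master solution is cut off, and the second one passes all feasibility checks. Finite termination mirrors the argument in the text, since only finitely many extreme points can ever be added.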

4.2 Linear Mixed-Integer Recourse 

In the present subsection we allow for integer requirements on
second-stage variables. Again we assume that $X \subseteq \mathbb{R}^m$
is a nonempty compact polyhedron and that $\mu$ is the discrete measure
introduced at the beginning of the present section. We consider problem
(4) with

    $\Phi(t) := \min\{ q^T y : Wy \ge t,\ y \in Y \}$.    (11)

For notational convenience we have integrated the former vector
$(y, y')$ into one vector y now varying in
$Y := \mathbb{Z}_+^{\bar m} \times \mathbb{R}_+^{m'}$. Accordingly, the
former $(q, q')$ and $(W, W')$ are integrated into q and W. To be
consistent with Subsection 4.1 we have inequality constraints in (11).

Lemma 4.3 There exists a sufficiently large constant $M_1 > 0$ such that
problem (4) can be equivalently restated as

    $\min_{x, y, \theta} \Big\{ \sum_{j=1}^{J} \pi_j \theta_j : Wy_j \ge h_j - Tx,\ q^T y_j + c^T x - \varphi_0 \le M_1 \theta_j,$
    $\qquad x \in X,\ y_j \in Y,\ \theta_j \in \{0, 1\},\ j = 1, \ldots, J \Big\}$.    (12)

Proof: We choose $M_1$ by

    $M_1 := \sup\{ c^T x + \Phi(h_j - Tx) : x \in X,\ j = 1, \ldots, J \}$.

To see that this supremum is finite, recall the compactness of X and
the general assumptions on $\Phi$ in the paragraph following formula (2).
Part (iv) of Proposition 2.1 then confirms that $\Phi(h_j - Tx)$ remains
bounded if x and j vary over X and $\{1, \ldots, J\}$, respectively.

The selection of $M_1$ guarantees that for any $x \in X$ and $y_j \in Y$
such that $Wy_j \ge h_j - Tx$ the selection $\theta_j = 1$ is feasible.

Given x, the selection $\theta_j = 0$ is feasible if and only if there
exists a $y_j \in Y$ fulfilling $Wy_j \ge h_j - Tx$ and
$c^T x + q^T y_j \le \varphi_0$. The latter holds if and only if
$c^T x + \Phi(h_j - Tx) \le \varphi_0$, which is equivalent to
$\chi(x, h_j) = 0$. This proves that (12) is equivalent to
$\min\{ \sum_{j=1}^{J} \pi_j \chi(x, h_j) : x \in X \}$. □
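The equivalence in Lemma 4.3 can be checked by brute force on a small example. The following sketch assumes a one-dimensional instance with a single integer recourse variable, a grid standing in for X, and a finite range of candidate recourse values; the data and helper names are hypothetical, and enumeration is used only because the instance is tiny.

```python
import math

C, Q, T, PHI0 = 1.0, 2.0, 1.0, 5.0                   # hypothetical data
SCENARIOS = [(0.5, 0.2), (2.2, 0.5), (4.7, 0.3)]     # pairs (h_j, pi_j)
X_GRID = [0.25 * i for i in range(17)]               # stand-in for X = [0, 4]
Y_RANGE = range(0, 11)                               # enough integers for this data


def phi(t):
    """Integer recourse value function: min{Q*y : y >= t, y in Z_+}."""
    return Q * max(math.ceil(t), 0)


def chi(x, h):
    """Indicator (7): 1 if the total cost exceeds the threshold PHI0."""
    return 1 if C * x + phi(h - T * x) > PHI0 else 0


# Big-M constant chosen as in the proof, with the supremum taken over the grid.
M1 = max(C * x + phi(h - T * x) for x in X_GRID for h, _ in SCENARIOS)


def big_m_objective(x):
    """Inner minimum of (12) over (y, theta) for fixed x, by enumeration."""
    total = 0.0
    for h, pi in SCENARIOS:
        feasible_theta = [theta for y in Y_RANGE for theta in (0, 1)
                          if y >= h - T * x and Q * y + C * x - PHI0 <= M1 * theta]
        total += pi * min(feasible_theta)
    return total


def direct_objective(x):
    """Probability objective evaluated directly through the indicator."""
    return sum(pi * chi(x, h) for h, pi in SCENARIOS)


if __name__ == "__main__":
    for x in X_GRID:
        print(x, big_m_objective(x), direct_objective(x))
```

For every grid point the two objectives agree, which is exactly the statement of the lemma restricted to this instance.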



Compared with problem (8), problem (12) again arises by representing
the indicator function $\chi$ from (7) by a binary variable. Lacking duality,
however, prevents the usage of optimality cuts such that minimization 
with respect to y has to be carried out explicitly in (12). Hence, (8) is a 
variant of (12) where the linear programming nature of the second stage 
enables an algorithmic shortcut. 



Problem (12) is a mixed-integer linear program that quickly becomes 
large-scale in practical applications. General purpose mixed-integer lin- 
ear programming algorithms and software fail in such situations. As an 
alternative, we present a decomposition method based on Lagrangian 
relaxation of nonanticipativity. This decomposition method for block- 
angular stochastic integer programs has been elaborated for the first 
time in [7] for the expectation-based model (3). 



Introduce in (12) copies $x_j$, $j = 1, \ldots, J$, according to the
number of scenarios, and add the nonanticipativity constraints
$x_1 = \ldots = x_J$ (or an equivalent system), for which we use the
notation $\sum_{j=1}^{J} H_j x_j = 0$ with proper $(l, m)$-matrices
$H_j$, $j = 1, \ldots, J$. Problem (12) then becomes

    $\min_{x, y, \theta} \Big\{ \sum_{j=1}^{J} \pi_j \theta_j : Tx_j + Wy_j \ge h_j,\ c^T x_j + q^T y_j - M_1 \theta_j \le \varphi_0,$
    $\qquad x_j \in X,\ y_j \in Y,\ \theta_j \in \{0, 1\},\ j = 1, \ldots, J,\ \sum_{j=1}^{J} H_j x_j = 0 \Big\}$.    (13)

This formulation suggests Lagrangian relaxation of the interlinking
constraints $\sum_{j=1}^{J} H_j x_j = 0$. For $\lambda \in \mathbb{R}^l$
we consider the functions

    $L_j(x_j, y_j, \theta_j, \lambda) := \pi_j \theta_j + \lambda^T H_j x_j, \quad j = 1, \ldots, J$,

    $L(x, y, \theta, \lambda) := \sum_{j=1}^{J} L_j(x_j, y_j, \theta_j, \lambda)$.

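The Lagrangian dual of (13) then is the optimization problem

    $\max\{ D(\lambda) : \lambda \in \mathbb{R}^l \}$    (14)

where

    $D(\lambda) = \min\Big\{ \sum_{j=1}^{J} L_j(x_j, y_j, \theta_j, \lambda) : Tx_j + Wy_j \ge h_j,$
    $\qquad c^T x_j + q^T y_j - M_1 \theta_j \le \varphi_0,$
    $\qquad x_j \in X,\ y_j \in Y,\ \theta_j \in \{0, 1\},\ j = 1, \ldots, J \Big\}$.

For separability reasons we have

    $D(\lambda) = \sum_{j=1}^{J} D_j(\lambda)$    (15)

where

    $D_j(\lambda) = \min\{ L_j(x_j, y_j, \theta_j, \lambda) : Tx_j + Wy_j \ge h_j,$
    $\qquad c^T x_j + q^T y_j - M_1 \theta_j \le \varphi_0,$    (16)
    $\qquad x_j \in X,\ y_j \in Y,\ \theta_j \in \{0, 1\} \}$.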

$D(\lambda)$ being the pointwise minimum of affine functions in
$\lambda$, it is piecewise affine and concave. Hence, (14) is a
non-smooth concave maximization (or convex minimization) problem. Such
problems can be tackled with advanced bundle methods, for instance with
Kiwiel's proximal bundle method NOA 3.0, [17, 18]. At each iteration,
these methods require the objective value and one subgradient of D. The
structure of D, cf. (15), enables substantial decomposition, since the
single-scenario problems (16) can be tackled separately. Their moderate
size often allows application of general purpose mixed-integer linear
programming codes.
Altogether, the optimal value $z_{LD}$ of (14) provides a lower bound to
the optimal value z of problem (12). From integer programming ([19]) it
is well-known that in general one has to live with a positive duality
gap. On the other hand, it holds that $z_{LD} \ge z_{LP}$ where $z_{LP}$
denotes the optimal value to the LP relaxation of (12). The lower bound
obtained by the above procedure, hence, is never worse than the bound
obtained by eliminating the integer requirements.

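As a rough illustration of how the bound from (14)-(16) arises, the
following Python sketch relaxes nonanticipativity for a toy problem with
two scenarios. Everything in it is an assumption made for illustration: the
data `X`, `PI`, `H`, the stand-in cost function `scenario_cost`, and the
plain subgradient ascent with diminishing steps, which replaces the bundle
method mentioned in the text. The scenario subproblems are solved by
enumeration, standing in for a mixed-integer solver.

```python
# Toy Lagrangian relaxation of nonanticipativity (hypothetical data).
X = range(0, 4)          # admissible first-stage values, x in {0, 1, 2, 3}
PI = [0.5, 0.5]          # scenario probabilities pi_j
H = [1.0, -1.0]          # H_1 x_1 + H_2 x_2 = 0  <=>  x_1 = x_2


def scenario_cost(j, x):
    """Stand-in for the scenario-j cost of first-stage decision x."""
    targets = [0, 3]
    return (x - targets[j]) ** 2 + 1.0


def solve_scenario(j, lam):
    """D_j(lam): minimize pi_j * cost_j(x) + lam * H_j * x by enumeration."""
    def objective(x):
        return PI[j] * scenario_cost(j, x) + lam * H[j] * x
    best_x = min(X, key=objective)
    return objective(best_x), best_x


def dual_value(lam):
    """Return D(lam) and the subgradient sum_j H_j x_j(lam)."""
    parts = [solve_scenario(j, lam) for j in range(len(PI))]
    value = sum(v for v, _ in parts)
    subgradient = sum(H[j] * x for j, (_, x) in enumerate(parts))
    return value, subgradient


def maximize_dual(iterations=50):
    """Subgradient ascent on D with step size 1/k; returns best value seen."""
    lam, best = 0.0, float("-inf")
    for k in range(1, iterations + 1):
        value, g = dual_value(lam)
        best = max(best, value)
        if g == 0:       # scenario copies agree: nonanticipativity holds
            break
        lam += g / k
    return best


def primal_optimum():
    """Optimal value of the unrelaxed toy problem (one common x)."""
    return min(sum(PI[j] * scenario_cost(j, x) for j in range(len(PI)))
               for x in X)
```

By weak duality every value `dual_value(lam)[0]` is a lower bound on
`primal_optimum()`. In this convex toy instance the bound happens to be
tight; with genuinely integer second stages a positive gap is to be
expected.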
In Lagrangian relaxation, the results of the dual optimization often
provide starting points for heuristics to find promising feasible points.
Our relaxed constraints being very simple ($x_1 = \ldots = x_J$), ideas for
such heuristics come up straightforwardly. For example, examine the
$x_j$-components, $j = 1, \ldots, J$, of solutions to (16) for optimal or
nearly optimal $\lambda$, and decide for the most frequent value arising, or
average and round if necessary.
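The two heuristics just named can be sketched in a few lines of Python. The
function names and the `integer_mask` argument are hypothetical, introduced
here only for illustration; the candidate returned by either function still
has to be checked for feasibility in (12).

```python
from collections import Counter


def most_frequent(scenario_solutions):
    """Componentwise majority vote over the scenario copies x_j."""
    return [Counter(column).most_common(1)[0][0]
            for column in zip(*scenario_solutions)]


def average_and_round(scenario_solutions, probabilities, integer_mask):
    """Probability-weighted average; integer components are rounded."""
    n = len(scenario_solutions[0])
    average = [sum(p * x[i] for p, x in zip(probabilities, scenario_solutions))
               for i in range(n)]
    # Python's round() rounds exact halves to the nearest even integer.
    return [float(round(a)) if is_int else a
            for a, is_int in zip(average, integer_mask)]
```

For example, with scenario solutions `[[1, 0], [1, 2], [0, 2]]` the majority
vote picks the value that occurs most often in each component, while the
averaging variant weights the copies by their scenario probabilities before
rounding.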

If the heuristic yields a feasible solution to (12), then the objective value
of the latter provides an upper bound $\bar z$ for $z$. Together with the
lower bound $z_{LD}$ this gives the quality certificate (gap)
$\bar z - z_{LD}$.
The full algorithm improves this certificate by embedding the procedure
described so far into a branch-and-bound scheme in the spirit of
global optimization. Let $\mathcal{P}$ denote the list of current problems
and $z_{LD} = z_{LD}(P)$ the Lagrangian lower bound for $P \in \mathcal{P}$.
The algorithm then proceeds as follows.

Algorithm 4.4 

Step 1 (Initialization): Set $\bar z = +\infty$ and let $\mathcal{P}$
consist of problem (13).

Step 2 (Termination): If $\mathcal{P} = \emptyset$ then the solution
$\bar x$ that yielded $\bar z = Q_{\mathbb{P}}(\bar x)$, cf. (4), is optimal.

Step 3 (Node selection): Select and delete a problem $P$ from $\mathcal{P}$
and solve its Lagrangian dual. If the optimal value $z_{LD}(P)$ hereof equals
$+\infty$ (infeasibility of a subproblem) then go to Step 2.

Step 4 (Bounding): If $z_{LD}(P) \geq \bar z$ go to Step 2 (this step can be
carried out as soon as the value of the Lagrangian dual rises above
$\bar z$). Consider the following situations:

1 The scenario solutions $x_j$, $j = 1, \ldots, J$, are identical: If
$Q_{\mathbb{P}}(x_j) < \bar z$ then let $\bar z = Q_{\mathbb{P}}(x_j)$ and
delete from $\mathcal{P}$ all problems $P'$ with $z_{LD}(P') \geq \bar z$.
Go to Step 2.

2 The scenario solutions $x_j$, $j = 1, \ldots, J$, differ: Compute the
average $\bar x = \sum_{j=1}^{J} \pi_j x_j$ and round it by some heuristic to
obtain $x^R$. If $Q_{\mathbb{P}}(x^R) < \bar z$ then let
$\bar z = Q_{\mathbb{P}}(x^R)$ and delete from $\mathcal{P}$ all problems
$P'$ with $z_{LD}(P') \geq \bar z$. Go to Step 5.

Step 5 (Branching): Select a component $x_{(k)}$ of $x$ and add two new
problems to $\mathcal{P}$ obtained from $P$ by adding the constraints
$x_{(k)} \leq \lfloor \bar x_{(k)} \rfloor$ and
$x_{(k)} \geq \lfloor \bar x_{(k)} \rfloor + 1$, respectively (if $x_{(k)}$
is an integer component), or $x_{(k)} \leq \bar x_{(k)} - \varepsilon$ and
$x_{(k)} \geq \bar x_{(k)} + \varepsilon$, respectively, where
$\varepsilon > 0$ is a tolerance parameter to have disjoint subdomains.
Go to Step 3.

The algorithm works both with and without integer requirements
in the first stage. It is obviously finite in case $X$ is bounded and all
$x$-components have to be integers. If $x$ is mixed-integer (or continuous,
as in the former presentation) some stopping criterion to avoid endless
branching on the continuous components has to be employed. Some first
computational experiments with Algorithm 4.4 are reported in [36].
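The branching rule of Step 5 can be sketched as follows. The function
`branch` and its interval representation of a subproblem are assumptions
made for this illustration, not part of the algorithm as published; each
child is described only by the bounds on the branching component.

```python
import math


def branch(lower, upper, xbar_k, is_integer, eps=1e-3):
    """Split the range [lower, upper] of one component at the average xbar_k.

    Returns the (lower, upper) bounds of the two child problems.
    """
    if is_integer:
        floor_value = math.floor(xbar_k)
        return (lower, floor_value), (floor_value + 1, upper)
    # Continuous component: leave a gap of width 2 * eps so that the two
    # subdomains are disjoint.
    return (lower, xbar_k - eps), (xbar_k + eps, upper)
```

For an integer component with average 3.7 on the range [0, 10] this yields
the children [0, 3] and [4, 10]. For a continuous component the tolerance
`eps` also limits how finely the range can be subdivided, which is one
possible basis for the stopping criterion mentioned above.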

5. Multistage Extension 

The two-stage stochastic programs introduced in Section 2 are based 
on the assumptions that uncertainty is unveiled at once and that deci- 
sions subdivide into those before and those after unveiling uncertainty. 
Often, a more complex view is appropriate at this place. Multistage 
stochastic programs address the situation where uncertainty is unveiled 
stepwise with intermediate decisions. 

The modeling starts with a finite horizon sequential decision process
under uncertainty where the decision $x_t \in \mathbb{R}^{m_t}$ at stage
$t \in \{1, \ldots, T\}$ is based on information available up to time $t$
only. Information is modeled as a discrete time stochastic process
$\{\xi_t\}_{t=1}^{T}$ on some probability space
$(\Omega, \mathcal{A}, \mathbb{P})$ with $\xi_t$ taking values in
$\mathbb{R}^{s_t}$. The random vector $\xi^t := (\xi_1, \ldots, \xi_t)$
then reflects the information available up to time $t$.
Nonanticipativity, i.e., the requirement that $x_t$ must not depend on
future information, is formalized by saying that $x_t$ is measurable with
respect to the $\sigma$-algebra $\mathcal{A}_t \subseteq \mathcal{A}$ which
is generated by $\xi^t$, $t = 1, \ldots, T$.
Clearly, $\mathcal{A}_t \subseteq \mathcal{A}_{t+1}$ for all
$t = 1, \ldots, T - 1$. As in the two-stage case, the first-stage decision
$x_1$ usually is deterministic. Therefore,
$\mathcal{A}_1 = \{\emptyset, \Omega\}$. Moreover, we assume that
$\mathcal{A}_T = \mathcal{A}$.

The constraints of our multistage extensions can be subdivided into
three groups. The first group comprises conditions on $x_t$ arising from
the individual time stages:

$$x_t(\omega) \in X_t, \quad B_t(\xi_t(\omega))\, x_t(\omega) \geq d_t(\xi_t(\omega)) \quad \mathbb{P}\text{-almost surely},\ t = 1, \ldots, T. \tag{17}$$

Here, $X_t \subseteq \mathbb{R}^{m_t}$ is a set whose convex hull is a
polyhedron. In this way, integer requirements to components of $x_t$ are
allowed for. For simplicity we assume that $X_t$ is compact. The next group
of constraints models linkage between different time stages:

$$\sum_{\tau=1}^{t} A_{t\tau}(\xi_t(\omega))\, x_\tau(\omega) \geq g_t(\xi_t(\omega)) \quad \mathbb{P}\text{-almost surely}. \tag{18}$$

Finally, there is the nonanticipativity of $x_t$, i.e.,

$$x_t \text{ is measurable with respect to } \mathcal{A}_t, \quad t = 1, \ldots, T. \tag{19}$$

In addition to the constraints we have a linear objective function

$$\sum_{t=1}^{T} c_t(\xi_t(\omega))\, x_t(\omega).$$

The matrices $A_{t\tau}(\cdot)$, $B_t(\cdot)$ as well as the right-hand sides
$d_t(\cdot)$, $g_t(\cdot)$ and the cost coefficients $c_t(\cdot)$ all have
conformal dimensions and depend affinely linearly on the relevant components
of $\xi$.

The decisions $x_t$ are understood as members of the function spaces
$L_\infty(\Omega, \mathcal{A}, \mathbb{P}; \mathbb{R}^{m_t})$,
$t = 1, \ldots, T$. The constraints (17), (18) then impose pointwise
conditions on the $x_t$, whereas (19) imposes functional constraints, in
fact, membership in a linear subspace of
$\times_{t=1}^{T} L_\infty(\Omega, \mathcal{A}, \mathbb{P}; \mathbb{R}^{m_t})$,
see e.g. [31] and the references therein.

Now we are in the position to formulate the multistage extensions to 
the expectation- and probability-based stochastic programs (3) and (4), 
respectively. 

The multistage extension of (3) is the minimization of expected minimal
costs subject to nonanticipativity of decisions:

$$\min_{x \text{ fulfilling (19)}} \int_\Omega \min_{x(\omega)} \Big\{ \sum_{t=1}^{T} c_t(\xi_t(\omega))\, x_t(\omega) \;:\; (17), (18) \Big\}\, \mathbb{P}(d\omega) \tag{20}$$

To have the integral in the objective well-defined, the additional
assumption $\xi_t \in L_1(\Omega, \mathcal{A}, \mathbb{P}; \mathbb{R}^{s_t})$,
$t = 1, \ldots, T$, is imposed in model (20), see [31] for further details.

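The multistage extension of (4) is the minimization of the probability
that minimal costs exceed a preselected threshold
$\varphi_0 \in \mathbb{R}$. Again this minimization takes place over
nonanticipative decisions only:

$$\min_{x \text{ fulfilling (19)}} \mathbb{P}\Big( \Big\{ \omega \in \Omega \;:\; \min_{x(\omega)} \Big\{ \sum_{t=1}^{T} c_t(\xi_t(\omega))\, x_t(\omega) \;:\; (17), (18) \Big\} > \varphi_0 \Big\} \Big) \tag{21}$$

The minimization in the integrand of (20) being separable with respect
to $\omega \in \Omega$, it is possible to interchange integration and
minimization. Then the problem can be restated as follows:

$$\min \Big\{ \int_\Omega \sum_{t=1}^{T} c_t(\xi_t(\omega))\, x_t(\omega)\, \mathbb{P}(d\omega) \;:\; x \text{ fulfilling } (17), (18), (19) \Big\}. \tag{22}$$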

Extending the argument from Lemma 4.3 we introduce an additional
variable $\theta \in L_\infty(\Omega, \mathcal{A}, \mathbb{P}; \{0, 1\})$ as
well as a sufficiently big constant $M > 0$. Then problem (21) can be
equivalently rewritten as:

$$\min \Big\{ \int_\Omega \theta(\omega)\, \mathbb{P}(d\omega) \;:\; \sum_{t=1}^{T} c_t(\xi_t(\omega))\, x_t(\omega) - \varphi_0 \leq M \cdot \theta(\omega),\ \theta(\omega) \in \{0, 1\}\ \mathbb{P}\text{-a.s.},\ x \text{ fulfilling } (17), (18), (19) \Big\}. \tag{23}$$

Problem (22) is the well-known multistage stochastic (mixed-integer) 
linear program. Without integer requirements, the problem has been 
studied intensively, both from structural and from algorithmic view- 
points. The reader may wish to sample from [4, 5, 8, 10, 13, 15, 20, 
25, 26, 27, 32] to obtain insights into these developments. With integer 
requirements, problem (22) is less well-understood. Existing results are 
reviewed in [31]. 

To the best of our knowledge, the multistage extension (21) has not 
been addressed in the literature so far. Some basic properties of (21), 
(22) regarding existence and structure of optimal solutions can be de- 
rived by following arguments that were employed for the expectation- 
based model (22) in [31]. Their mathematical foundations are laid out 
in [11, 12, 28]. The arguments can be outlined as follows: 

Problem (22) concerns the minimization of an abstract expectation over
a function space, subject to measurability with respect to a filtered
sequence of $\sigma$-algebras. Theorems 1 and 2 in [12] (whose assumptions
can be verified for (22) using statements from [11, 28]) provide sufficient
conditions for the solvability of such minimization problems and
for the solutions to be obtainable recursively by dynamic programming.
The stage-wise recursion rests on minimizing in the $t$-th stage the regular
conditional expectation (with respect to $\mathcal{A}_t$) of the optimal
value from stage $t + 1$. When arriving at the first stage, a deterministic
optimization problem in $x_1$ remains (recall that
$\mathcal{A}_1 = \{\emptyset, \Omega\}$). Its objective function
$Q_{\mathbb{P}}(x_1)$ can be regarded as the multistage counterpart to the
function $Q_{\mathbb{P}}(x)$ that we have studied in Section 3.

Given that (22) is a well-defined and solvable optimization problem,
Sections 3 and 4 provide several points of departure for future research.
For instance, unveiling the structure of $Q_{\mathbb{P}}(x_1)$ may be
possible by analysing the interplay of conditional expectations and
mixed-integer value functions. Regarding solution techniques, the extension
of Algorithm 4.4 to the multistage situation may be fruitful. Indeed, it is
well known that the nonanticipativity in (19) is a linear constraint. With
a discrete distribution of $\xi$ this leads to a system of linear equations.
Lagrangian relaxation of these constraints produces single-scenario
subproblems, and the scheme of Algorithm 4.4 readily extends. However,
compared with the two-stage situation, the relaxed constraints are more
complicated, so that primal heuristics are not that obvious, and the
dimension $l$ of the Lagrangian dual (14) may require approximate instead of
exact solution of (14). Further algorithmic ideas for (22) may
arise from Lagrangian relaxation of either (17) or (18). In [31] this is
discussed for the expectation-based model (22).

Abstract Realistic optimal control problems from flight mechanics are
currently solved by sophisticated direct or indirect methods in a fast and
reliable way. Often one is not only interested in the optimal solution of
one control problem, but also in the sensitivity of the optimal solution
with respect to perturbations in certain parameters (constants or model
functions) of the process. In the past this problem was solved by
time-consuming parameter studies: a large number of almost identical
optimal control problems were solved numerically, and sensitivity
derivatives were approximated by finite differences. Recently a new
approach, called parametric sensitivity analysis, was adapted to the direct
solution of optimal control processes [3]. It uses the information gathered
in the optimal solution of the unperturbed (nominal) optimal control problem
to compute sensitivity differentials of all problem functions with respect
to these parameters. This new approach is described in detail for an
example from trajectory optimization.

Keywords: parametric sensitivity analysis, optimal control, direct methods, trajec- 
tory optimization. 

Introduction 

Realistically modelled optimal control problems can be solved efficiently
and reliably by sophisticated direct and indirect methods (see e.g. the
survey articles [1], [10]). Trajectory optimization problems for aircraft
and space vehicles usually pose hard challenges for the direct and indirect
solution algorithms. In the last decade several direct algorithms have
proved their ability to solve trajectory optimization problems accurately
and reliably, e.g. SOCS (Betts [2]), GESOP (Jansch, Well, Schnepper [9]),
DIRCOL (von Stryk [12]) and NUDOCCCS (Büskens [3]). Trajectory optimization
problems use in general complicated models of the surrounding atmospheric
effects, the performance and consumption of the engines and the ability to
maneuver. Usually optimal solutions are computed at first for a nominal data
set of the model. Later huge parameter studies are done for perturbed model
data. This means that the whole optimization process is started again
for the huge number of perturbed models.

We present a new approach of Büskens [3]: information already computed
during the solution of the nominal optimal control problem is exploited to
derive sensitivity information. This replaces the additional solution of
perturbed optimal control problems.

We explain the new approach of parametric sensitivity analysis in 
detail for an example from flight mechanics. The trajectory optimiza- 
tion problem is concerned with minimizing the amount of fuel used per 
travelled range over ground with periodic boundary conditions. It is 
interesting that by periodic trajectories and controls savings in fuel con- 
sumption can be achieved in comparison to the steady-state solution. 

In order to normalize the changing effects of the atmosphere due to the
weather, one uses data of a reference atmosphere in the computational
model. Unfortunately there exist several reference atmospheres.
In addition, one is interested in the effect of realistic changes of the
air density on the computed optimal solution.

We therefore provide not only the nominal solution but also the sen- 
sitivity with respect to the air density as an example. Note that no 
parameter studies are needed. Information gathered during the compu- 
tation of the nominal solution is used. 

Applications to further perturbation parameters in the model are
straightforward.

1. A Trajectory Optimization Problem

Aircraft usually use steady-state cruise to cover long distances. It
is interesting that these steady-state trajectories are non-optimal with
respect to minimizing fuel [11].

The following optimal control problem from [8], enlarged by a perturbation
parameter $p$, describes the problem of minimizing fuel per travelled range
over ground for a realistically modelled aircraft flying in a vertical
plane.

State variables are velocity $v$, flight path angle $\gamma$, altitude $h$
and weight $W$. The range $x$ is used as the independent variable. The lift
coefficient $C_L$ and the throttle setting $\delta$ are the control
variables.

For a given value of the perturbation parameter $p$ (nominal value is
here $p_0 = 1$) find control functions $C_L(x; p)$ and $\delta(x; p)$ and
the final range $x_f(p)$ such that the cost functional

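$$I = [W_0 - W(x_f)] / x_f$$

is minimized and the following equations of motion, control constraints
and boundary conditions are fulfilled.

$$\begin{aligned}
\frac{dv}{dx} &= \frac{g}{v \cos\gamma} \left[ \frac{T(h, M)\, \delta - D(h, M, C_L)}{W} - \sin\gamma \right], \\
\frac{d\gamma}{dx} &= \frac{g}{v^2} \left[ \frac{L(h, M, C_L)}{W \cos\gamma} - 1 \right], \\
\frac{dh}{dx} &= \tan\gamma, \\
\frac{dW}{dx} &= -\frac{c(h, M)\, T(h, M)\, \delta}{v \cos\gamma},
\end{aligned} \tag{1}$$

$$0 \leq C_L \leq C_{L,\max}, \qquad \delta_{\min} \leq \delta \leq 1, \tag{2}$$

$$v(0) = v(x_f), \quad \gamma(0) = \gamma(x_f), \quad h(0) = h(x_f), \quad W(0) = W_0. \tag{3}$$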

Model functions are the Mach number $M$, speed of sound $a$, thrust
$T$, consumption $c$, lift $L$, drag $D$, air density $\rho$. $S$ denotes
the constant reference area, $g$ denotes the gravitational constant.

$$M(v, h) = v / a(h)$$


a{h) 


r ~3 — 
= aiA L 
V i=0 


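$$\begin{aligned}
T(h, M) &= c_1(h) + c_2(h) M + c_3(h) M^2 + c_4(h) M^3, \\
c(h, M) &= d_1(h) + d_2(h) M + d_3(h) M^2 + d_4(h) M^3, \\
L(v, h, C_L) &= \rho(h)\, v^2 S\, C_L / 2, \\
D(v, h, C_L) &= \rho(h)\, v^2 S\, [C_{D0}(M) + \Delta C_D(M, C_L)] / 2
\end{aligned}$$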


Cdo{M) 

ACd{M,Cl) 


= a\ arctan[a2 (M — 03)] + 04 
= h(M)Cl + b2{M)Cl + b3{M)Cl + b4{M)Cl 


p{h) 


= p Po exp [/S7 + + / 3 s exp ^ ^ 

The coefficients of the polynomials $b_i(M)$, $c_i(h)$, $d_i(h)$ and the
remaining constants, among them $\rho_0$, $W_0$, $\delta_{\min}$ and
$C_{L,\max}$, can be found in [8].




Figure 1. Nominal optimal state $v(x; p_0)$

Figure 2. Nominal optimal state $\gamma(x; p_0)$

Figure 3. Nominal optimal state $h(x; p_0)$

Figure 4. Nominal optimal state $W(x; p_0)$

As an additional perturbation parameter we use $p$. A solution by an
indirect multiple shooting algorithm is presented in [8] for the nominal
value $p_0 = 1$. At that time, before sophisticated direct methods were
developed, elaborate homotopies were required for the indirect solution.
Today these initial estimates for the indirect method can be provided
easily by direct methods. Figures 1-4 show the optimal states and
Figures 5-6 show the optimal controls for the nominal value
$p = p_0 = 1$, computed by the direct method NUDOCCCS (Büskens [3]).

Figure 6. Nominal optimal control $\delta(x; p_0)$

Moreover, in a post-processing step the direct method NUDOCCCS can also
compute the sensitivities of the optimal states
$\frac{dv}{dp}(x; p), \ldots, \frac{dW}{dp}(x; p)$ and of the optimal
controls $\frac{dC_L}{dp}(x; p)$, $\frac{d\delta}{dp}(x; p)$ with respect to
perturbation parameters $p$.

2. Parametric Sensitivity Analysis 

The general mathematical approach for a parametric sensitivity anal- 
ysis of perturbed optimal control problems is based on NLP methods: 
The following autonomous perturbed control problem of Mayer-form 
will be referred to as problem OCP(p): 

For a given perturbation parameter $p \in P$ find control functions
$u(x; p)$ and the final time $x_f(p)$ such that the cost functional

$$J = g(y(x_f), p) \tag{4}$$

is minimized subject to the following constraints

$$\begin{aligned}
y'(x) &= f(y(x), u(x), p), \quad x \in [0, x_f], \\
\psi(y(0), y(x_f), p) &= 0, \\
C(y(x), u(x), p) &\leq 0, \quad x \in [0, x_f].
\end{aligned} \tag{5}$$

Figure 5. Nominal optimal control $C_L(x; p_0)$

Herein $y(x) \in \mathbb{R}^n$ denotes the state of a system and
$u(x) \in \mathbb{R}^m$ the control with respect to an independent variable
$x$, which is often the time.

In the previously introduced trajectory optimization problem the
independent variable $x$ denotes the range, the state is given by
$y := (v, \gamma, h, W)^T$ and the control by $u := (C_L, \delta)^T$.

The functions $g : \mathbb{R}^n \times P \to \mathbb{R}$,
$f : \mathbb{R}^{n+m} \times P \to \mathbb{R}^n$,
$\psi : \mathbb{R}^{2n} \times P \to \mathbb{R}^r$, and
$C : \mathbb{R}^{n+m} \times P \to \mathbb{R}^k$ are assumed to be
sufficiently smooth on appropriate open sets. The final time $x_f$ is either
fixed or free. Note that the formulation of mixed control-state constraints
$C(y(x), u(x), p) \leq 0$ in (5) includes pure control constraints
$C(u(x), p) \leq 0$ as well as pure state constraints $C(y(x), p) \leq 0$.
It is well known that problems of form OCP($p$) can be solved efficiently by
approximating the control functions $u^i \approx u(x_i)$ for given mesh
points $x_i \in [0, x_f]$, $i = 1, \ldots, N$, and solving the state
variables by standard integration methods. This leads to approximations
$y(x_i; z, p) \approx y(x_i)$, $z := (u^1, \ldots, u^N)$, of the state at
the mesh points $x_i$. For a more detailed discussion please refer to
[3]-[6].

Therefore the optimal control problem OCP($p$) is replaced by the
finite dimensional perturbed nonlinear optimization problem NLP($p$):

For a given $p \in P$

$$\begin{aligned}
\min_{z} \quad & g(y(x_N; z, p), p) \\
\text{s.t.} \quad & \psi(y(x_N; z, p), p) = 0, \\
& C(y(x_i; z, p), u^i, p) \leq 0, \quad i = 1, \ldots, N.
\end{aligned} \tag{6}$$

Several reliable optimization codes have been developed for solving NLP
problems (6), e.g. SQP methods. This idea is implemented e.g. in
the direct methods SOCS, GESOP, DIRCOL and NUDOCCCS.

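An additional, and to our knowledge unique, feature of NUDOCCCS
(Büskens [3]) is the ability to compute accurately the sensitivity
differentials $\frac{dy}{dp}(x; p_0)$, $\frac{du}{dp}(x; p_0)$ of the
approximations

$$\begin{aligned}
y(x; p_0 + \Delta p) &\approx y(x; p_0) + \frac{dy}{dp}(x; p_0) \cdot \Delta p, \\
u(x; p_0 + \Delta p) &\approx u(x; p_0) + \frac{du}{dp}(x; p_0) \cdot \Delta p.
\end{aligned} \tag{7}$$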

This is done by the following idea:

Let $z_0$ denote the unperturbed solution of NLP($p_0$) for a nominal
parameter $p = p_0$ and let $h^a$ denote the collection of active
constraints in (6).

$$L(z, \mu, p) := g(y(x_N; z, p), p) + \mu^T h^a(z, p)$$

is the Lagrangian function with the associated Lagrange multiplier $\mu$.
Then the following results hold [7]:

Solution Differentiability for NLP-problems: Suppose that the
optimal solution $(z_0, \mu_0)$ for the nominal problem NLP($p_0$) satisfies
a maximal rank condition for $h^a_z$, second order sufficient optimality
conditions and strict complementarity of the multiplier $\mu$. Then the
unperturbed solution $(z_0, \mu_0)$ can be embedded into a $C^1$-family of
perturbed solutions $(z(p), \mu(p))$ for NLP($p$) with $z(p_0) = z_0$,
$\mu(p_0) = \mu_0$.

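The sensitivity differentials of the optimal solutions are given by the
formula

$$\begin{pmatrix} \frac{dz}{dp}(p_0) \\[1ex] \frac{d\mu}{dp}(p_0) \end{pmatrix} = - \begin{pmatrix} L_{zz} & (h^a_z)^T \\ h^a_z & 0 \end{pmatrix}^{-1} \begin{pmatrix} L_{zp} \\ h^a_p \end{pmatrix} \tag{8}$$

evaluated at the optimal solution. This formula provides good
approximations for the sensitivity of the perturbed optimal controls at the
mesh points, i.e. for the quantities $\frac{du}{dp}(x_i; p_0)$,
$i = 1, \ldots, N$. Then the state sensitivities $\frac{dy}{dp}(x_i; p_0)$
are obtained by differentiating the control-state relation in (6),
$y(x_i) = y(x_i; z, p)$, with respect to the parameter $p$:

$$\frac{dy}{dp}(x_i; p_0) \approx \frac{\partial y}{\partial z}(x_i; z_0, p_0)\, \frac{dz}{dp}(p_0) + \frac{\partial y}{\partial p}(x_i; z_0, p_0). \tag{9}$$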



The sensitivity differentials of the adjoint variables or of the objective
functional can be calculated analogously.
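To make the sensitivity formula concrete, here is a minimal Python sketch
on a toy equality-constrained problem that is not taken from the paper:
minimize $\frac{1}{2}(z_1^2 + z_2^2) - p z_1$ subject to
$z_1 + z_2 - 1 = 0$. Its solution is known in closed form,
$z(p) = ((1+p)/2, (1-p)/2)$, $\mu(p) = (p-1)/2$, so the result of the linear
system can be checked. The helper `solve_linear` is a plain Gaussian
elimination written for this example.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def toy_sensitivity():
    """Return (dz1/dp, dz2/dp, dmu/dp) for the toy problem."""
    L_zz = [[1.0, 0.0], [0.0, 1.0]]   # Hessian of the Lagrangian in z
    h_z = [1.0, 1.0]                  # Jacobian of the active constraint
    L_zp = [-1.0, 0.0]                # mixed derivative of the Lagrangian
    h_p = 0.0                         # constraint does not depend on p
    kkt = [L_zz[0] + [h_z[0]], L_zz[1] + [h_z[1]], h_z + [0.0]]
    rhs = [-L_zp[0], -L_zp[1], -h_p]
    return solve_linear(kkt, rhs)


def exact_solution(p):
    """Closed-form minimizer z(p) of the toy problem."""
    return [(1 + p) / 2, (1 - p) / 2]


def first_order_prediction(p0, dp):
    """z(p0 + dp) approximated by z(p0) + dz/dp * dp."""
    dz1, dz2, _ = toy_sensitivity()
    z0 = exact_solution(p0)
    return [z0[0] + dz1 * dp, z0[1] + dz2 * dp]
```

Because the toy solution is linear in $p$, the first-order prediction
reproduces the perturbed solution exactly here; for a nonlinear problem it
is only accurate for small $\Delta p$.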




Figure 7. Sensitivity $dv/dp\,(x; p_0)$

Figure 8. Sensitivity $d\gamma/dp\,(x; p_0)$



We return to the example. In the first step the optimal nominal solution is
calculated by the code NUDOCCCS of Büskens [3], see Figures 1-6. In the
second step the sensitivity differentials of the model functions (states,
controls, adjoint variables, cost functional and further interesting model
functions) are calculated from equations (8), (9), see Figures 7-12.

These figures provide valuable information for the engineers. Addi- 
tional perturbation parameters can be added to the model. Basically 
only an additional matrix vector multiplication is needed in order to 
compute the sensitivity differentials of the states and controls for each 
component of the perturbation parameter. 





Figure 11. Sensitivity $dC_L/dp\,(x; p_0)$

Figure 12. Sensitivity $d\delta/dp\,(x; p_0)$

Abstract Standard interior-point algorithms usually show poor performance
when applied to multicommodity network flow problems. A recent
specialized interior-point algorithm for linear multicommodity network
flows overcame this drawback, and was able to efficiently solve large and
difficult instances. In this work we perform a computational evaluation
of an extension of that specialized algorithm for multicommodity problems
with convex and separable quadratic objective functions. As in the
linear case, the specialized method for convex separable quadratic problems
is based on the solution of the positive definite system that appears
at each interior-point iteration through a scheme that combines direct
(Cholesky) and iterative (preconditioned conjugate gradient) solvers.
The preconditioner considered for linear problems, which was instrumental
in the performance of the method, has proven to be even more
efficient for quadratic problems. The specialized interior-point algorithm
is compared with the general barrier solver of CPLEX 6.5, and
with the specialized codes PPRN and ACCPM, using a set of convex
separable quadratic multicommodity instances of up to 500000 variables
and 180000 constraints. The specialized interior-point method was, on
average, about 10 times and two orders of magnitude faster than the
CPLEX 6.5 barrier solver and the other two codes, respectively.

Keywords: Interior-point methods, network optimization, multicommodity flows,
quadratic programming, large-scale optimization.

1. Introduction

Multicommodity flows are widely used as a modeling tool in many
fields, e.g., in telecommunications and transportation problems. The
multicommodity network flow problem is a generalization of the minimum
cost network flow problem in which $k$ different items (the commodities)
have to be routed from a set of supply nodes to a set of demand nodes
using the same underlying network. Models of this kind are usually
very large and difficult linear programming problems, and there is a
wide literature about specialized approaches for their efficient solution.
However, most of them only deal with the linear objective function case.
In this work we consider a specialized interior-point algorithm for
multicommodity network flow problems with convex and separable quadratic
objective functions. The algorithm has been able to solve large and
difficult quadratic multicommodity problems in a fraction of the time
required by alternative solvers.

In recent years there has been a significant amount of research in the
field of multicommodity flows, mainly for linear problems. The new solution
strategies can be classified into four main categories: simplex-based
methods [6, 15], decomposition methods [10, 12], approximation methods [13],
and interior-point methods [4, 12]. Some of these algorithms
were compared in [7] for linear problems.

The available literature for nonlinear multicommodity flows is not so
extensive. For instance, of the above approaches, only the codes of [6]
and [12] (named PPRN, nonlinear primal partitioning, and ACCPM, analytic
center cutting plane method, respectively) were extended to
nonlinear (possibly non-quadratic) objective functions. In this work we
compare the specialized interior-point algorithm with those two codes
using a set of large-scale quadratic multicommodity problems. The
specialized interior-point algorithm turned out to be the most efficient
strategy for all the instances. A description and empirical evaluation of
additional nonlinear multicommodity algorithms can be found in the survey
[14].

The specialized interior-point method presented here is an extension
for convex and separable quadratic objective functions of the algorithm
introduced in [4] for linear multicommodity flows. The solution strategy
suggested for linear problems (i.e., solving the positive definite system at
each interior-point iteration through a scheme that combines direct and
iterative solvers) can also be applied to convex and separable quadratic
multicommodity problems. Moreover, as will be shown in the computational
results, this solution strategy turned out to be even more
efficient for quadratic than for linear problems.

Up to now most applications of multicommodity flow models dealt
with linear objective functions. Quadratic multicommodity problems
are not usually recognized as a modeling tool, mainly due to the lack of
an efficient solver for them. The specialized interior-point method can
help to fill this void. The efficient solution of large and difficult
quadratic multicommodity problems would open new modeling perspectives
(e.g., they could be used in network design algorithms [9]).

The structure of the document is as follows. In Section 2 we formulate 
the quadratic multicommodity flow problem. In Section 3 we sketch the 
specialized interior-point algorithm for multicommodity flow problems, 
and show that it can also be applied to the quadratic case. Finally in 
Section 4 we perform an empirical evaluation of the algorithm using a 
set of large-scale quadratic multicommodity flow instances, and three 
alternative solvers (i.e., CPLEX 6.5, PPRN and ACCPM). 

2. The quadratic multicommodity flow problem 

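Given a network of $m$ nodes, $n$ arcs and $k$ commodities, the quadratic
multicommodity network flow problem can be formulated as

$$\begin{array}{ll}
\min & \displaystyle \sum_{i=0}^{k} \left( (c^i)^T x^i + (x^i)^T Q^i x^i \right) \\[2ex]
\text{subject to} &
\begin{bmatrix}
N & 0 & \cdots & 0 & 0 \\
0 & N & \cdots & 0 & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & N & 0 \\
\mathbb{1} & \mathbb{1} & \cdots & \mathbb{1} & \mathbb{1}
\end{bmatrix}
\begin{bmatrix} x^1 \\ x^2 \\ \vdots \\ x^k \\ x^0 \end{bmatrix}
=
\begin{bmatrix} b^1 \\ b^2 \\ \vdots \\ b^k \\ u \end{bmatrix} \\[2ex]
& 0 \leq x^i \leq u^i, \quad i = 1 \ldots k \\
& 0 \leq x^0 \leq u.
\end{array} \tag{1}$$

Vectors $x^i \in \mathbb{R}^n$, $i = 1 \ldots k$, are the flows for each
commodity, while $x^0 \in \mathbb{R}^n$ are the slacks of the mutual
capacity constraints. $N \in \mathbb{R}^{m \times n}$ is the node-arc
incidence matrix of the underlying network, and $\mathbb{1}$ denotes
the $n \times n$ identity matrix. $c^i \in \mathbb{R}^n$ are the arc linear
costs for each commodity and for the slacks. $u^i \in \mathbb{R}^n$ and
$u \in \mathbb{R}^n$ are respectively the individual capacities for each
commodity and the mutual capacity for all the commodities.
$b^i \in \mathbb{R}^m$ are the supply/demand vectors at the nodes of the
network for each commodity. Finally, $Q^i \in \mathbb{R}^{n \times n}$ are
the arc quadratic costs for each commodity and for the slacks. We will
restrict to the case where $Q^i$ is a positive semidefinite diagonal matrix,
thus having a convex and separable quadratic objective function. Note that
(1) is a quadratic problem with $\tilde m = km + n$ constraints and
$\tilde n = (k + 1)n$ variables.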

Most of the applications of multicommodity flows in the literature
only involve linear costs. However, quadratic costs can be useful in the
following situations:

■ Adding a quadratic penalty term to the occupation of a line in
a transmission/transportation network. In this case we would set
$Q^i = 0$, $i = 1 \ldots k$. This would penalize saturation of lines,
guaranteeing a reserve capacity to redistribute the current pattern of
flows when line failures occur.

■ Replacing a convex and separable nonlinear function by its quadratic
approximation.

■ Finding the closest pattern of flows $x$ to the currently used
$\bar x$ when changes in capacities/demands are performed. In this case
the quadratic term would be $(x - \bar x)^T (x - \bar x)$.

■ Solution of the subproblems in an augmented Lagrangian relaxation
scheme for the network design problem [9, 11].



3. The specialized interior-point algorithm 



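The multicommodity problem (1) is a quadratic program that can be
written in standard form as

$$\min \left\{ c^T x + \tfrac{1}{2} x^T Q x \;:\; A x = b,\ x + s = u,\ x, s \geq 0 \right\} \tag{2}$$

where $x, s, u \in \mathbb{R}^{\tilde n}$,
$Q \in \mathbb{R}^{\tilde n \times \tilde n}$ and
$b \in \mathbb{R}^{\tilde m}$. The dual of (2) is

$$\max \left\{ b^T y - \tfrac{1}{2} x^T Q x - w^T u \;:\; A^T y - Q x + z - w = c,\ z, w \geq 0 \right\}, \tag{3}$$

where $y \in \mathbb{R}^{\tilde m}$ and $z, w \in \mathbb{R}^{\tilde n}$.
For problem (1), matrix $Q$ is made of $k + 1$ diagonal blocks; blocks
$Q^i$, $i = 1 \ldots k$, are related to the flows for each commodity, while
$Q^0$ are the quadratic costs of the mutual capacity slacks.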

The solution of (2) and (3) by an interior-point algorithm is obtained
through the following system of nonlinear equations (see [18] for details)

$$\begin{aligned}
r_{xz} &\equiv \mu e - XZe = 0 \\
r_{sw} &\equiv \mu e - SWe = 0 \\
r_b &\equiv b - Ax = 0 \\
r_c &\equiv c - (A^T y - Qx + z - w) = 0 \\
(x, s, z, w) &\geq 0,
\end{aligned} \tag{4}$$

where $e$ is a vector of 1's of appropriate dimension, and matrices $X$,
$Z$, $S$, $W$ are diagonal matrices made from vectors $x$, $z$, $s$, $w$.
The set of unique solutions of (4) for each $\mu$ value is known as the
central path, and when $\mu \to 0$ these solutions superlinearly converge to
those of (2) and (3) [18]. System (4) is usually solved by a damped version
of Newton's method, reducing the $\mu$ parameter at each iteration. This
procedure is known as the path-following algorithm [18]. Figure 1 shows the
main steps of the path-following algorithm for quadratic problems.

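Figure 1. Path-following algorithm for quadratic problems.

Algorithm Path-following($A$, $Q$, $b$, $c$, $u$):
 1 Initialize $x > 0$, $s > 0$, $y$, $z > 0$, $w > 0$;
 2 while $(x, s, y, z, w)$ is not solution do
 3   $\Theta = (X^{-1} Z + S^{-1} W + Q)^{-1}$;
 4   $r = S^{-1} r_{sw} + r_c - X^{-1} r_{xz}$;
 5   $(A \Theta A^T) \Delta y = r_b + A \Theta r$;
 6   $\Delta x = \Theta (A^T \Delta y - r)$;
 7   $\Delta w = S^{-1} (r_{sw} + W \Delta x)$;
 8   $\Delta z = r_c + \Delta w + Q \Delta x - A^T \Delta y$;
 9   Compute $\alpha_P \in (0, 1]$, $\alpha_D \in (0, 1]$;
10   $x \leftarrow x + \alpha_P \Delta x$;
11   $(y, z, w) \leftarrow (y, z, w) + \alpha_D (\Delta y, \Delta z, \Delta w)$;
12 end_while
End_algorithm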
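The steps of Figure 1 can be tried out on a very small instance. The Python
sketch below is an illustration under simplifying assumptions, not the
specialized code evaluated in this work: $A$ consists of a single row `a`
(so that $A \Theta A^T$ is a scalar), $Q$ is diagonal and stored as the
vector `q`, one common damped step length replaces the separate primal and
dual steps of line 9, and the centering parameter `sigma`, the starting
point and the stopping test are choices made only for this example.

```python
def path_following_qp(c, q, a, b, u, sigma=0.1, tol=1e-9, max_iter=200):
    """Minimize c'x + 0.5 x'diag(q)x  s.t.  a'x = b, 0 <= x <= u."""
    n = len(c)
    x = [0.5 * ui for ui in u]                 # strictly interior start
    s = [u[i] - x[i] for i in range(n)]
    y, z, w = 0.0, [1.0] * n, [1.0] * n
    for _ in range(max_iter):
        gap = sum(x[i] * z[i] + s[i] * w[i] for i in range(n))
        mu = sigma * gap / (2 * n)             # reduced barrier parameter
        r_b = b - sum(a[i] * x[i] for i in range(n))
        r_c = [c[i] - (a[i] * y - q[i] * x[i] + z[i] - w[i]) for i in range(n)]
        if max(abs(r_b), max(abs(v) for v in r_c), gap) < tol:
            break
        r_xz = [mu - x[i] * z[i] for i in range(n)]
        r_sw = [mu - s[i] * w[i] for i in range(n)]
        # Lines 3-8 of Figure 1, written componentwise.
        theta = [1.0 / (z[i] / x[i] + w[i] / s[i] + q[i]) for i in range(n)]
        r = [r_sw[i] / s[i] + r_c[i] - r_xz[i] / x[i] for i in range(n)]
        dy = ((r_b + sum(a[i] * theta[i] * r[i] for i in range(n)))
              / sum(a[i] ** 2 * theta[i] for i in range(n)))
        dx = [theta[i] * (a[i] * dy - r[i]) for i in range(n)]
        dw = [(r_sw[i] + w[i] * dx[i]) / s[i] for i in range(n)]
        dz = [r_c[i] + dw[i] + q[i] * dx[i] - a[i] * dy for i in range(n)]
        # Line 9, simplified: one step that keeps x, s, z, w positive.
        ratios = [1.0]
        for i in range(n):
            for value, step in ((x[i], dx[i]), (s[i], -dx[i]),
                                (z[i], dz[i]), (w[i], dw[i])):
                if step < 0:
                    ratios.append(0.95 * value / -step)
        alpha = min(ratios)
        # Lines 10-11.
        x = [x[i] + alpha * dx[i] for i in range(n)]
        s = [u[i] - x[i] for i in range(n)]
        y += alpha * dy
        z = [z[i] + alpha * dz[i] for i in range(n)]
        w = [w[i] + alpha * dw[i] for i in range(n)]
    return x, y
```

A possible call is
`path_following_qp([1.0, 1.5], [1.0, 1.0], [1.0, 1.0], 1.0, [1.0, 1.0])`,
a two-variable problem whose minimizer $x = (0.75, 0.25)$ can be found by
hand. In the multicommodity setting the scalar division that yields `dy`
becomes the linear system of line 5, which is where the computational
effort concentrates.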



The specialized interior-point algorithm introduced in [4] for linear
multicommodity problems exploited the constraints matrix structure of
the problem for solving $(A \Theta A^T) \Delta y = \bar b$ (line 5 of Figure 1, with $\bar b$
denoting its right-hand side), which is by
far the most computationally expensive step. Considering the structure
of $A$ in (1) and accordingly partitioning the diagonal matrix $\Theta$ defined
in line 3 of Figure 1, we obtain



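\[
A \Theta A^T =
\begin{bmatrix} B & C \\ C^T & D \end{bmatrix}
=
\left[\begin{array}{ccc|c}
N \Theta^1 N^T & \dots & 0 & N \Theta^1 \\
\vdots & \ddots & \vdots & \vdots \\
0 & \dots & N \Theta^k N^T & N \Theta^k \\ \hline
\Theta^1 N^T & \dots & \Theta^k N^T & \sum_{i=0}^{k} \Theta^i
\end{array}\right],
\tag{5}
\]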



where $\Theta^i = ((X^i)^{-1} Z^i + (S^i)^{-1} W^i + Q^i)^{-1}$, $i = 0, 1 \dots k$. Note that the
only difference between the linear and quadratic case is the term $Q^i$ of $\Theta^i$.
Moreover, as we are assuming that $Q^i$ is a diagonal matrix, $\Theta^i$ can be
easily computed.

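Using (5), and appropriately partitioning $\Delta y$ and $\bar b$, we can write
$(A \Theta A^T) \Delta y = \bar b$ as

\[
\begin{bmatrix} B & C \\ C^T & D \end{bmatrix}
\begin{bmatrix} \Delta y_1 \\ \Delta y_2 \end{bmatrix}
=
\begin{bmatrix} \bar b_1 \\ \bar b_2 \end{bmatrix}.
\tag{6}
\]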



By block multiplication, we can reduce (6) to

\[ (D - C^T B^{-1} C) \Delta y_2 = (\bar b_2 - C^T B^{-1} \bar b_1) \tag{7} \]

\[ B \Delta y_1 = (\bar b_1 - C \Delta y_2). \tag{8} \]
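The block elimination leading to (7) and (8) can be illustrated as follows. This is a minimal dense sketch under the assumption that $B$ and the Schur complement are nonsingular; the function name and the use of `numpy.linalg.solve` (instead of the block Cholesky factorizations and the PCG method used by the specialized algorithm) are choices made here for illustration only.

```python
import numpy as np

def solve_by_blocks(B, C, D, b1, b2):
    """Solve [[B, C], [C.T, D]] [dy1; dy2] = [b1; b2] through (7) and (8)."""
    Binv_C = np.linalg.solve(B, C)                  # B^{-1} C
    Binv_b1 = np.linalg.solve(B, b1)                # B^{-1} b1
    H = D - C.T @ Binv_C                            # Schur complement
    dy2 = np.linalg.solve(H, b2 - C.T @ Binv_b1)    # system (7)
    dy1 = np.linalg.solve(B, b1 - C @ dy2)          # system (8)
    return dy1, dy2
```

The result agrees with a direct solve of the full system (6); the point of the reduction is that $B$ is block diagonal, so only the much smaller Schur complement system couples the commodities.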



System (8) is solved by performing a Cholesky factorization of each
diagonal block $N \Theta^i N^T$, $i = 1 \dots k$, of $B$. System (7), with matrix $H =
D - C^T B^{-1} C$, the Schur complement of (5), is solved by a preconditioned
conjugate gradient (PCG) method. A good preconditioner is
instrumental for the performance of the method. In [4] it was proved
that if

■ $D$ is positive semidefinite, and
■ $D + C^T B^{-1} C$ is positive semidefinite,

then the inverse of the Schur complement can be computed as

\[ H^{-1} = \left( \sum_{i=0}^{\infty} \left( D^{-1} (C^T B^{-1} C) \right)^i \right) D^{-1}. \tag{9} \]

The preconditioner is thus obtained by truncating the infinite power series
(9) at some term $h$ (in practice $h = 0$ or $h = 1$; all the computational
results in this work have been obtained with $h = 0$). Since

\[ D_q = D_l + \sum_{i=0}^{k} (Q^i)^{-1}, \]

$D_q$ and $D_l$ denoting the $D$ matrix for a quadratic and linear problem
respectively, it is clear that for quadratic multicommodity problems the
above two conditions are also guaranteed, and then the same preconditioner
can also be applied. Moreover, since we are assuming diagonal $Q^i$
matrices, for $h = 0$ the preconditioner is equal to $H^{-1} = D^{-1}$, which
is also diagonal, as for linear multicommodity problems. This is instrumental
in the overall performance of the algorithm. More details about
this solution strategy can be found in [4].
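A truncated power-series preconditioner of this kind can be sketched as below. The code is an illustration under simplifying assumptions, not the implementation of [4]: $D$ is taken to be diagonal and stored as the vector `d`, the matrix `P` stands for $C^T B^{-1} C$ and is formed explicitly, and the names `apply_preconditioner` and `pcg` are hypothetical.

```python
import numpy as np

def apply_preconditioner(d, P, v, h=0):
    """Apply the series (9) truncated at term h to v, with D = diag(d), P = C^T B^{-1} C."""
    term = v / d                  # D^{-1} v, the h = 0 preconditioner
    out = term.copy()
    for _ in range(h):
        term = (P @ term) / d     # next power of D^{-1} P applied to D^{-1} v
        out += term
    return out

def pcg(H, rhs, precond, tol=1e-10, max_iter=200):
    """Textbook preconditioned conjugate gradient for a symmetric positive definite H."""
    x = np.zeros_like(rhs)
    r = rhs - H @ x
    zv = precond(r)
    p = zv.copy()
    rz = r @ zv
    for it in range(max_iter):
        if np.linalg.norm(r) <= tol * np.linalg.norm(rhs):
            return x, it
        Hp = H @ p
        alpha = rz / (p @ Hp)
        x += alpha * p
        r -= alpha * Hp
        zv = precond(r)
        rz_new = r @ zv
        p = zv + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter
```

With `h=0` each application costs one diagonal scaling, which is why a diagonal $D$ matters for the overall performance; `h=1` adds one product with $C^T B^{-1} C$ per application.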
The effectiveness of the preconditioner is governed by the spectral
radius of $D^{-1}(C^T B^{-1} C)$, which is always in $[0, 1)$. The farther it is from 1,
the better the preconditioner. According to the computational results
obtained, this value seems to be smaller for quadratic problems than for
the equivalent linear problems without the quadratic term, since fewer
conjugate gradient iterations are performed for solving (7). Moreover,
the number of interior-point iterations also decreases in some instances.
This can be observed in Figures 2, 3 and 4. Figures 2 and 3 show the
overall number of PCG and IP iterations for the linear and quadratic
versions of the Mnetgen problems in Table 1 of Section 4. Both versions
only differ in the $Q$ matrix. Clearly, for the quadratic problems fewer IP
and PCG iterations are performed. The number of PCG iterations per IP
iteration has also been observed to decrease for quadratic problems. For
instance, Figure 4 shows the number of PCG iterations per IP iteration
for the linear and quadratic versions of problem PDS20 in Table 2 of
Section 4. We chose this instance because it can be considered a good
representative of the general behavior observed and, in addition, the
number of IP iterations is similar for the linear and quadratic problems.
A better understanding of the relationship between the spectral radius
of $D^{-1}(C^T B^{-1} C)$ for the linear and quadratic problems is part of the
further work to be done.



Figure 2. Overall number of PCG iterations for the quadratic and linear Mnetgen instances.

Figure 3. Overall number of IP iterations for the quadratic and linear Mnetgen instances.

Figure 4. Number of PCG iterations per interior-point iteration, for the quadratic and linear PDS20 instance.

4. Computational results

The specialized algorithm of the previous section was tested using
two sets of quadratic multicommodity instances. As far as we know,
there is no standard set of quadratic multicommodity problems. Thus
we developed a meta-generator that adds the quadratic term

\[ \sum_{i=1}^{k} \sum_{j=1}^{n} q^i_j (x^i_j)^2 \]

to the objective function of a linear multicommodity problem. Coefficients
$q^i_j$ are randomly obtained from a uniform distribution $U[0, C]$,
where

\[ C = \frac{\sum_{i=1}^{k} \sum_{j=1}^{n} c^i_j}{kn}, \]

in an attempt to guarantee that linear and quadratic terms are of the
same order.
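The coefficient generation just described can be sketched as follows. This is not the authors' meta-generator: the function name, the array layout (one row of linear arc costs per commodity), the fixed seed, and the assumption that all linear costs are nonnegative are made here only for illustration.

```python
import numpy as np

def quadratic_costs(c, seed=0):
    """Draw q[i, j] from U[0, C], with C the average of the linear costs c[i, j].

    c is a (k, n) array: k commodities, n arcs (assumed nonnegative).
    """
    k, n = c.shape
    C = c.sum() / (k * n)                 # average linear cost
    rng = np.random.default_rng(seed)     # seeded for reproducibility (assumption)
    q = rng.uniform(0.0, C, size=(k, n))
    return q, C
```

For example, `quadratic_costs(np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]))` uses $C = 21/6 = 3.5$ and returns six coefficients in $[0, 3.5)$.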

We applied our meta-generator to two sets of linear multicommodity
instances obtained with the well-known Mnetgen [1] and PDS [3] generators.
Tables 1 and 2 show the dimensions of the instances. Columns
“m”, “n”, and “k” give the number of nodes, arcs and commodities of
the network. Columns “ñ” and “m̃” give the number of variables and
constraints of the quadratic problem. The Mnetgen and PDS generators
can be downloaded from

http://www.di.unipi.it/di/groups/optimize/Data/MMCF.html.

We solved both sets with an implementation of the specialized interior-point
algorithm, referred to as IPM [4], and with CPLEX 6.5 [8], a state-of-the-art
interior-point code for quadratic problems. The IPM code, as
well as a parallel version [5], can be downloaded for research purposes
from
Table 1. Dimensions of the quadratic Mnetgen instances.

Instance        m      n     k        ñ       m̃
M64-4          64    524     4     2620     780
M64-8          64    532     8     4788    1044
M64-16         64    497    16     8449    1521
M64-32         64    509    32    16797    2557
M64-64         64    511    64    33215    4607
M128-4        128    997     4     4985    1509
M128-8        128   1089     8     9801    2113
M128-16       128   1114    16    18938    3162
M128-32       128   1141    32    37653    5237
M128-64       128   1171    64    76115    9363
M128-128      128   1204   128   155316   17588
M256-4        256   2023     4    10115    3047
M256-8        256   2165     8    19485    4213
M256-16       256   2308    16    39236    6404
M256-32       256   2314    32    76362   10506
M256-64       256   2320    64   150800   18704
M256-128      256   2358   128   304182   35126
M256-256      256   2204   256   566428   67740



Table 2. Dimensions of the quadratic PDS instances.

Instance        m      n     k        ñ       m̃
PDS1          126    372    11     4464    1758
PDS10        1399   4792    11    57504   20181
PDS20        2857  10858    11   130296   42285
PDS30        4223  16148    11   193776   62601
PDS40        5652  22059    11   264708   84231
PDS50        7031  27668    11   332016  105009
PDS60        8423  33388    11   400656  126041
PDS70        9750  38396    11   460752  145646
PDS80       10989  42472    11   509664  163351
PDS90       12186  46161    11   553932  180207
Table 3. Results for the quadratic Mnetgen problems. The last column is the relative error (f_CPLEX - f_IPM) / f_CPLEX.

                 CPLEX 6.5           IPM
Instance         CPU   n.it.       CPU   n.it.   rel. error
M64-4            0.7     12        0.3     18     -2.0e-6
M64-8            3.1     12        0.8     20      5.6e-7
M64-16          10.7     15        1.6     21     -1.6e-6
M64-32          20.8     16        4.3     25      1.9e-6
M64-64          46.8     14       10.7     31     -1.1e-6
M128-4           2.8     11        0.8     17      1.2e-7
M128-8          12.6     11        2.1     21      2.5e-6
M128-16         80.5     13        5.9     28      6.9e-6
M128-32        153.6     14       15.7     35      2.8e-6
M128-64        305.5     14       35.3     36     -1.4e-6
M128-128       741.9     15       98.8     48     -5.7e-7
M256-4          13.1     13        2.7     20     -7.9e-6
M256-8          73.8     14        6.7     22      2.4e-5
M256-16        634.1     15       22.5     34      2.3e-5
M256-32       1105.2     16       49.9     36      2.4e-6
M256-64       2102.2     16      140.0     53      4.9e-7
M256-128      4507.3     17      327.6     62      5.0e-6
M256-256     11761.3     24      835.3     85      7.0e-6



Table 4. Results for the quadratic PDS problems. The last column is the relative error (f_CPLEX - f_IPM) / f_CPLEX.

                 CPLEX 6.5           IPM
Instance         CPU   n.it.       CPU   n.it.   rel. error
PDS1             1.6     23        1.3     29     -2.7e-7
PDS10          234.8     43       78.6     62     -6.6e-7
PDS20         1425.6     55      271.0     69      1.9e-6
PDS30         5309.8     76      938.3     96     -6.0e-6
PDS40        10712.3     79     1965.2    105     -4.1e-6
PDS50        14049.7     80     3163.3    114     -4.1e-7
PDS60        17133.4     71     3644.2     95      3.6e-6
PDS70        25158.3     74     5548.7    101     -1.9e-7
PDS80        26232.1     74     7029.9    100     -1.3e-6
PDS90        32412.9     77     9786.7    109     -1.2e-6




Solving Quadratic Multicommodity Problems through an IP Algorithm 209 

http://www-eio.upc.es/~jcastro.

For each instance, Tables 3 and 4 give the CPU time in seconds
required by IPM and CPLEX 6.5 (columns “CPU”), the number of
interior-point iterations performed by IPM and CPLEX 6.5 (columns
“n.it.”), and the relative error $(f_{CPLEX} - f_{IPM}) / f_{CPLEX}$ of the solution obtained
with IPM (assuming CPLEX 6.5 provides the exact optimum). Executions
were carried out on a Sun Ultra2 2200 workstation with 200MHz,
1Gb of main memory, and ≈45 Linpack Mflops.

Figures 5-8 summarize the information of Tables 3 and 4. Figures 
5 and 6 show respectively the ratio between the CPU times of CPLEX 
6.5 and IPM, and the number of interior-point iterations performed by 
CPLEX 6.5 and IPM, with respect to the dimension of the problem (i.e., 
number of variables), for the Mnetgen instances. The same information 
is shown in Figures 7 and 8 for the PDS problems. 



Figure 5. Ratio of the execution 
times of CPLEX 6.5 and IPM for the 
quadratic Mnetgen problems. 



Figure 6. Number of IP iterations 
performed by CPLEX 6.5 and IPM for 
the quadratic Mnetgen problems. 




Figure 7. Ratio of the execution 
times of CPLEX 6.5 and IPM for the 
quadratic PDS problems. 



Figure 8. Number of IP iterations 
performed by CPLEX 6.5 and IPM for 
the quadratic PDS problems. 
From Figures 5 and 7, IPM was in all cases more efficient than
CPLEX 6.5 (the time ratio was always greater than 1.0). For some Mnetgen
and PDS instances IPM was about 20 and 5 times faster, respectively.
It is important to note that IPM makes use of standard Cholesky
routines [16], whereas CPLEX 6.5 includes a highly tuned and optimized
factorization code [2]. Therefore, in principle, the performance of IPM
could be improved even further. Looking at Figures 6 and 8 it can be seen that
IPM performed many more interior-point iterations than CPLEX 6.5.
This is because, unlike CPLEX 6.5, the current version of IPM does
not implement Mehrotra's predictor-corrector heuristic. In [4] it was
shown that Mehrotra's heuristic was not appropriate for linear multicommodity
problems. However, for quadratic problems, and because of
the good behavior of the preconditioner, it could be an efficient option.
Adding Mehrotra's strategy to IPM is part of the additional tasks to be
performed.
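The time ratios plotted in Figures 5 and 7 can be recomputed from the CPU columns of Tables 3 and 4. The short script below does so; the dictionary layout and variable names are choices made here, while the numbers are the (CPLEX 6.5, IPM) CPU seconds reported in the tables.

```python
# (CPLEX 6.5 CPU, IPM CPU) in seconds, copied from Tables 3 and 4.
mnetgen = {
    "M64-4": (0.7, 0.3), "M64-8": (3.1, 0.8), "M64-16": (10.7, 1.6),
    "M64-32": (20.8, 4.3), "M64-64": (46.8, 10.7), "M128-4": (2.8, 0.8),
    "M128-8": (12.6, 2.1), "M128-16": (80.5, 5.9), "M128-32": (153.6, 15.7),
    "M128-64": (305.5, 35.3), "M128-128": (741.9, 98.8), "M256-4": (13.1, 2.7),
    "M256-8": (73.8, 6.7), "M256-16": (634.1, 22.5), "M256-32": (1105.2, 49.9),
    "M256-64": (2102.2, 140.0), "M256-128": (4507.3, 327.6),
    "M256-256": (11761.3, 835.3),
}
pds = {
    "PDS1": (1.6, 1.3), "PDS10": (234.8, 78.6), "PDS20": (1425.6, 271.0),
    "PDS30": (5309.8, 938.3), "PDS40": (10712.3, 1965.2),
    "PDS50": (14049.7, 3163.3), "PDS60": (17133.4, 3644.2),
    "PDS70": (25158.3, 5548.7), "PDS80": (26232.1, 7029.9),
    "PDS90": (32412.9, 9786.7),
}

def ratios(table):
    """CPU-time ratio CPLEX / IPM for every instance of a table."""
    return {name: cplex / ipm for name, (cplex, ipm) in table.items()}

mnetgen_ratios = ratios(mnetgen)
pds_ratios = ratios(pds)
best_mnetgen = max(mnetgen_ratios.items(), key=lambda kv: kv[1])
best_pds = max(pds_ratios.items(), key=lambda kv: kv[1])
for name, value in {**mnetgen_ratios, **pds_ratios}.items():
    print(f"{name:10s} {value:6.2f}")
```

Every ratio is above 1; the largest ones occur for M256-16 among the Mnetgen instances and for PDS30 among the PDS instances.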

Finally, we compared IPM and CPLEX 6.5 with PPRN [6] and with
an implementation of the ACCPM [12] that we developed using the
standard ACCPM library distribution [17]. For this purpose we chose
some of the smallest Mnetgen and PDS instances, whose dimensions
are shown in Tables 5 and 6 (columns m, n, k, ñ and m̃, with the
same meaning as before). These tables also give the execution time in
seconds (columns “CPU”) for each solver. Clearly, CPLEX 6.5 and IPM
outperformed both PPRN and ACCPM. Moreover, PPRN and ACCPM
did not seem to be competitive approaches for quadratic multicommodity
flows. (On the other hand, unlike CPLEX 6.5 and IPM, they can deal
with nonlinear objective functions.)



Table 5. Dimensions and results for the small quadratic Mnetgen problems.

                                                       CPU
Instance      m     n    k       ñ      m̃    CPLEX    IPM      PPRN     ACCPM
M64-4        64   524    4    2620    780      0.7    0.3       6.0     158.0
M64-8        64   532    8    4788   1044      3.1    0.8      38.0    2116.9
M64-16       64   497   16    8449   1521     10.7    1.6     184.6    5683.4
M64-32       64   509   32   16797   2557     20.8    4.3    failed   15753.4
M64-64       64   511   64   33215   4607     46.8   10.7   12710.1   34027.3



Table 6. Dimensions and results for the small quadratic PDS problems.

                                                       CPU
Instance      m      n    k       ñ      m̃    CPLEX    IPM     PPRN    ACCPM
PDS1        126    372   11    4464   1758      1.6    1.3     75.5   failed
PDS2        252    746   11    8952   3518      5.2    7.9    293.3   failed
PDS3        390   1218   11   14616   5508     10.2   10.5    903.4   failed
PDS4        541   1790   11   21480   7741     22.4   33.9   1702.8   failed
PDS5        686   2325   11   27900   9871     53.2   44.7   2631.3   failed

5. Conclusions and future tasks

From the computational experience reported, it can be stated that the
specialized interior-point algorithm is a promising approach for separable
quadratic multicommodity problems. Among the future tasks to be
performed we find a deep study of the behavior of the spectral radius
of $D^{-1}(C^T B^{-1} C)$, the addition of Mehrotra's predictor-corrector method,
and using the algorithm in a network design framework.
Abstract Nonlinear constrained optimal control problems as a rule suffer from
the so-called two-norm discrepancy, which in particular says that under
stable optimality conditions the objective functionals satisfy a quadratic
local growth estimate in terms of the $L_2$ norms but in $L_\infty$ neighborhoods
of the solution only. Furthermore, in the case of weak local optima with
continuous control functions, stability w.r.t. parameter changes usually
can be expected to hold in the $L_\infty$ sense rather than in $L_p$.

Whenever we consider problems with discontinuous optimal control
behavior, these results are too restrictive to discuss general variations
of the solution including changes in the break points or switches in the
active sets. In the paper we show how the use of certain integrated optimality
criteria obtained via a duality approach allows for estimates also
in the case of discontinuous controls. We consider $L_2$ and $L_1$ quadratic
growth estimates and discuss consequences for the behavior of minimizing
sequences.

Keywords: Constrained control problems, sufficient optimality conditions, stability. 

1. Local optimality criteria in integrated form 

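Consider first a general nonlinear constrained optimal control problem
(primal problem formulation):

\[ \text{(P)} \qquad \min\; J(x,u) = k(x(0), x(T)) + \int_0^T r(t, x(t), u(t))\, dt \]

\[ \text{s.t.} \quad \dot x(t) = f(t, x(t), u(t)) \quad \text{a.e. in } [0,T], \tag{1} \]

\[ \beta(x(0), x(T)) = 0, \tag{2} \]

\[ g(t, x(t), u(t)) \le 0 \quad \text{a.e. in } [0,T]. \tag{3} \]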
The pair $(x,u) \in W^1_\infty(0,T;\mathbb{R}^n) \times L_\infty(0,T;\mathbb{R}^m)$ is called admissible for
(P) if the state equation (1) (including the boundary conditions (2))
together with the inequality constraints (3) (where g : [0,T] x IR" x 
iR™, P : [0,T] X JR" x JR*^ — >■ JR®) is fulfilled. All data functions 
are assumed to be sufficiently smooth. 

Denote by $H$ the Hamiltonian, and by $\tilde H$ the augmented Hamiltonian
related to the problem (P):

\[ H(t,x,u,p) = r(t,x,u) + p^T f(t,x,u), \]

\[ \tilde H(t,x,u,p,\mu) = H(t,x,u,p) + \mu^T g(t,x,u), \qquad \mu \ge 0. \]

Further, let $W$ stand for the set

\[ W = \{ (t,x,u) : t \in [0,T],\; g(t,x,u) \le 0 \}. \]

Consider the dual variable $S$ given by a function $S : [0,T] \times \mathbb{R}^n \to \mathbb{R}$
and the auxiliary functional $\Phi$ for $\xi_1, \xi_2 \in \mathbb{R}^n$:

\[ \Phi(\xi_1, \xi_2, S) = k(\xi_1, \xi_2) + S(0, \xi_1) - S(T, \xi_2). \]

We assume that $S$ is at least Lipschitz continuous w.r.t. $(t,x)$ whenever
$(t,x,u) \in W$. Define

\[ \Gamma(S) = \inf \{ \Phi(\xi_1, \xi_2, S) : \beta(\xi_1, \xi_2) = 0 \}. \]

Then the following problem is dual to the original control problem:

\[ \text{(D)} \qquad \max\; \Gamma(S) \]

\[ \text{s.t.} \quad H(t,x,u,S_x(t,x)) + S_t(t,x) \ge 0 \quad \text{a.e. on } W = \{ (t,x,u) : t \in [0,T],\; g(t,x,u) \le 0 \}. \]

If $(x,u)$ is an admissible pair for problem (P) and $S$ is feasible for (D)
then the duality relation ([9], also [21] or [4]) holds, i.e.

\[ J(x,u) \ge \Gamma(S). \]

The relation turns into an equality if and only if for some admissible
$(x_0, u_0)$ and feasible dual $S$

\[ \Psi(x_0, u_0, S) = \int_0^T \left[ H(t, x_0(t), u_0(t), S_x(t, x_0(t))) + S_t(t, x_0(t)) \right] dt = 0, \tag{4} \]

\[ \psi(x_0(0), x_0(T), S) = \Phi(x_0(0), x_0(T), S) - \Gamma(S) = 0. \]

In this case, the pair $(x_0, u_0)$ is a solution of (P).
The analysis of the behavior of $\Psi$ and $\psi$ can be further used to characterize
local minima of (P) in detail, including estimates for local growth
terms if available (cf. e.g. [7]). One can also distinguish between
weak and strong local optima, depending on the reference sets for
which optimality holds. To this end, let us introduce the sets

We ^ W n = IT n ( [0, 1] X Be{xo) X Too ) 

or IT, = IT n ( [0, 1] X Be{x^) x Tm(0) ) 

where the second set, with a given constant $M > 0$, is used to check for so-called
bounded-strong local optima (see e.g. [18] or [20]).

Abstract optimality criteria have been given in [17], [21] or (in slightly 
generalized formulation) in [7], [6]. The results are summed up in the 
following Theorem: 

Theorem 1 Let {xo^uq) be admissible for (P). Suppose that a function 
S : [0,T] X IR^ IR exists which is Lipschitz continuous w.r.t. x and 
piecewise continuously differentiable w.r.t. t such that for a suitably cho- 
sen positive constant e the following relations hold with 7 = 1, D{t) = 
ITe(^) and = Be(xo{0),xo{T)): 

(Rl) T(rr, u]S) >0 

V {x^u) ^ {xq^uq) with {x{t)^u{t)) e D{t) a.e. in [0,T]; 
(R2) ^(xo,uo;5) = 0; 

(R3) >0 G C7^(xo(0),xo(T)). 

Then {x^^uq) is a strict weak local minimizer of (P). 

If (Rl) - (R3) hold true for a certain constant e > 0 with 7 = 0, 
D{t) = We{t) and = Be{xo{0)^xo{T)), then {xq^uq) is a (strict) 
strong local minimizer. 

If the conditions are satisfied with D{t) = We{t) for some M > 0, the 
point {xq^uq) is a (strict) bounded- strong local optimum. 

The above Theorem differs from former results ([21], [17]) mainly by
the consistent use of the Hamilton-Jacobi inequality (cf. (D))
in integrated form. This fact corresponds to some relaxation in the
characterization of local minima by duality means and was first used
in [4] for the theoretical convergence analysis of certain discretization
methods.

Notice that in some cases it is possible to find estimates for $\Psi(x,u;S) -
\Psi(x_0,u_0;S)$ in terms of $\|x - x_0\|_2^2 + \|u - u_0\|_2^2$, i.e. a local quadratic
growth condition w.r.t. the $L_2$ topology ([17], [7]), although the reference
sets are neighborhoods w.r.t. $L_\infty$ (at least with respect to $x$). This fact
illustrates once more the effects of the so-called two-norm discrepancy
appearing in optimal control problems (cf. [10] for details).

The optimality criteria of Theorem 1 in their original parametric formulation
have been used in [17] to derive sufficient optimality conditions
for general control problems. The approach leads to conditions which
had been obtained in a different way via the investigation of so-called
stable weak optima ([12] or [15], see also [14]).

In particular, it was shown that the following criteria ensure the strict
weak local optimality of a solution pair $(x_0, u_0)$:

(PMP) Pontryagin's maximum principle:

\[ \dot x = \tilde H_p = f, \qquad \beta(x(0), x(T)) = 0; \]

\[ \dot p = -\tilde H_x, \qquad p(0) = -\nabla_1 k - \nabla_1 \beta^T \rho, \qquad p(T) = \nabla_2 k + \nabla_2 \beta^T \rho, \]

with a constant multiplier vector $\rho$;

\[ \tilde H_u = 0; \]

\[ \mu^T g = 0, \qquad \mu \ge 0, \qquad g \le 0. \]

Notice that in the transversality condition given above the subscripts
1, 2 stand for the gradient components corresponding to the initial and
to the final state vectors respectively.

For a given admissible pair $(x_0, u_0)$, let $f$, ... be evaluated along the
state-control trajectory. We will denote by $g^\sigma$, $\sigma > 0$, the set of $\sigma$-active
constraints at $t$, i.e. the set of $g_i$ such that $0 \ge g_i(t, x_0(t), u_0(t)) \ge -\sigma$.

(C1) Invertibility: For some positive constant $\sigma$, the gradients w.r.t.
$u$ of the $\sigma$-active constraints, $\nabla_u g_i^\sigma$, are uniformly linearly independent
a.e. on $[0,T]$.

(C2) Controllability: There are functions y G v G L^o satisfying 

= 0 ; y - Vxf y - Vuf v = Q a.e. 

together with the boundary condition Vi^^y(O) + V 2 $^y{T) = 0. 

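In order to formulate second-order conditions, the index set $I^\delta = \{ i :
\mu_i > \delta \}$ and the related tangent spaces

\[ T_\delta = \{ \zeta : (\nabla_{(x,u)} g_i)^T \zeta = 0 \;\; \forall i \in I^\delta \}, \]

\[ T'_\delta = \{ v : (\nabla_u g_i)^T v = 0 \;\; \forall i \in I^\delta \} \]

are useful. The conditions then can be given as follows:

(C3) Legendre-Clebsch Condition: For some positive $\delta$, a constant
$\alpha > 0$ exists such that the estimate $v^T \tilde H_{uu} v \ge \alpha |v|^2$ holds
$\forall v \in T'_\delta$ uniformly a.e. on the interval $[0,T]$.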
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(C4) Riccati Inequality: With the abbreviations R = —(Vug^)~^'Vx9^ 
and P = where ( • )~^ stands for the pseudoinverse and 

( • )-^ for the null- space projector of a matrix ( • ), introduce 

= fx+fuR. 

h^x ~ Rxx + HxuR + R^ Hux + R^ HuuR 

(/iL) = p [PHnuPy^P, hi^ = Hxu + R^Huu . 

For some 7 > 0, the matrix differential inequality 

Q y {hL + Qfn){hL)^^^\hL + Qfuf 

- {a^f Q - Qa^ - hix, + ll a.e. on [0,T] 
has a bounded in [0, 1] solution satisfying the boundary restrictions 

^^(v?fc + v?^P + Q(0))^ > Tl-ei" V^eiR" 

e{^lk + VlPp-Q[l))^ > 7iei' VeGiR^ 

It is known that (C1) - (C4) not only guarantee (R1) - (R3) (see
[7]), but also a) the Lipschitz stability of the solution in $L_\infty$ w.r.t.
small data perturbations ([11], [15], [3]), b) (in the case of a continuous
control $u_0$) the convergence of the Euler and related discretization
methods ([14], [4], [2]).

2. Bounded-strong minima. Piecewise conditions

The optimality criteria of Theorem 1
are formulated as integral conditions in terms of $\Psi$. When we are
interested in local growth estimates, this fact gives us the chance to
combine piecewise changing growth characterizations as they are typical
for switches in the optimal control or for the junction of free arcs and
arcs where certain constraints are active.

We begin our optimality analysis with the auxiliary functional $\Psi =
\Psi(x,u,S)$ from (4), where the integrand has the form

\[ R[t] = \left( H(t,x,u,\nabla_x S) + S_t \right)[t]. \tag{6} \]

As has been shown in [7], using $\Psi(x_0,u_0,S) = 0$ together with an
expansion

\[ S(t,x) = S_0(t) + p(t)^T (x - x_0(t)) + 0.5\, (x - x_0(t))^T Q(t) (x - x_0(t)) \]

and the optimal data $\nabla_x S = p_0 + Q(x - x_0)$ and $\dot S_0 = -r_0$, one can
express $R$ in the following way:

\[
\begin{aligned}
R[t] = {} & \tilde H(x,u,p_0,\mu_0) - \tilde H(x_0,u_0,p_0,\mu_0) - \tilde H_x(x_0,u_0,p_0,\mu_0)^T (x - x_0) \\
& + 0.5\, (x - x_0)^T \dot Q (x - x_0) + (x - x_0)^T Q \left( f(x,u) - f(x_0,u_0) \right) \\
& - \mu_0^T \left( g(x,u) - g(x_0,u_0) \right).
\end{aligned}
\]

By rearranging the terms related to variations w.r.t. $x$ or $u$, one can
separate two terms $R_{1,2}$ such that $R[t] = R_1[t] + R_2[t]$ and (with $p =
p_0 + Q(x - x_0)$)

\[
\begin{aligned}
R_1[t] &= \tilde H(x,u_0,p,\mu_0) - \tilde H(x_0,u_0,p,\mu_0) - \tilde H_x(x_0,u_0,p_0,\mu_0)^T (x - x_0) + 0.5\, (x - x_0)^T \dot Q (x - x_0) \\
&\approx 0.5\, (x - x_0)^T \left[ \tilde H_{xx} + Q f_x + f_x^T Q + \dot Q \right] (x - x_0); \\
R_2[t] &= \tilde H(x,u,p,\mu_0) - \tilde H(x,u_0,p,\mu_0) - \mu_0^T g(x,u).
\end{aligned}
\]

Under condition (C4), in particular $R_1[t]$ will be uniformly positive if
$\|x - x_0\|_\infty$ is sufficiently small.
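The expression for $R[t]$ and its splitting can be checked by a direct computation. The sketch below assumes that $Q(t)$ is symmetric and that complementarity $\mu_0^T g(x_0,u_0) = 0$ holds; the abbreviations $y$, $f_0$ and $r_0$ are introduced here only for this check.

```latex
\begin{align*}
y &= x - x_0(t), \qquad f_0 = f(x_0,u_0), \qquad r_0 = r(x_0,u_0), \\
S_x &= p_0 + Q y, \qquad
S_t = \dot S_0 + \dot p_0^T y - p_0^T f_0 + \tfrac12 y^T \dot Q y - y^T Q f_0, \\
R &= r + S_x^T f + S_t \\
  &= (r + p_0^T f) - (r_0 + p_0^T f_0) + \dot p_0^T y
     + \tfrac12 y^T \dot Q y + y^T Q (f - f_0)
     && (\dot S_0 = -r_0) \\
  &= H(x,u,p_0) - H(x_0,u_0,p_0) - \tilde H_x(x_0,u_0,p_0,\mu_0)^T y
     + \tfrac12 y^T \dot Q y + y^T Q (f - f_0)
     && (\dot p_0 = -\tilde H_x), \\
H &= \tilde H - \mu_0^T g
  \;\Longrightarrow\;
  H(x,u,p_0) - H(x_0,u_0,p_0)
  = \tilde H(x,u,p_0,\mu_0) - \tilde H(x_0,u_0,p_0,\mu_0)
    - \mu_0^T \bigl( g(x,u) - g(x_0,u_0) \bigr), \\
R_1 + R_2 &= \tilde H(x,u,p,\mu_0) - \tilde H(x_0,u_0,p,\mu_0)
  - \tilde H_x(x_0,u_0,p_0,\mu_0)^T y + \tfrac12 y^T \dot Q y - \mu_0^T g(x,u),
  \qquad p = p_0 + Q y, \\
\tilde H(x,u,p,\mu_0) &- \tilde H(x_0,u_0,p,\mu_0)
  = \tilde H(x,u,p_0,\mu_0) - \tilde H(x_0,u_0,p_0,\mu_0) + y^T Q (f - f_0).
\end{align*}
```

Inserting the last identity into $R_1 + R_2$ and using $\mu_0^T g(x_0,u_0) = 0$ reproduces $R$.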

In order to estimate $\Psi$ resp. $R_2$, in [6] the following general result
has been proved (Theorem 5 in the cited paper):

Theorem 2 Suppose $(x_0, u_0)$ to be a weak local minimizer satisfying
together with some bounded matrix function $Q$ the conditions (C1) - (C4).
Let

\[ R_2[t] \ge c_1 |u - u_0(t)|^{\nu(t)} - c_2 |x - x_0(t)| \qquad \forall u : |u| \le M \tag{7} \]

hold almost everywhere on $[0,T]$ with $\nu(t) \in \{1, 2\}$ and constants not
depending on $t$. Then $(x_0, u_0)$ is a bounded-strong local minimizer, and
positive constants $c$, $\epsilon$ exist such that

\[ J(x,u) - J(x_0,u_0) \ge c \|x - x_0\|_2^2 \tag{8} \]

for all admissible $(x,u)$ with $\|x - x_0\|_\infty \le \epsilon$, $\|u\|_\infty \le M$.

Important special cases are the following:

Case 1: $\tilde H(x_0, v, p_0, \mu_0)$ is strongly convex w.r.t. $v$.

In this case, $R_2 \ge c_1 |u - u_0|^2 - O(|x - x_0|)$, a situation which can
often be observed when $H$ takes its minimum in an inner point of the
control set.

Case 2: $u_0 \in \arg\min \tilde H(x_0, v, p_0, \mu_0)$, and $-\mu_0^T g(x_0, u) \ge c_1' |u - u_0|$.
Here one can conclude that $R_2 \ge c_1' |u - u_0| - O(|x - x_0|)$. The situation
occurs when a certain strict complementarity condition holds together
with the invertibility assumption (C1).
The criteria given above have been discussed in detail in [6] and tested 
on a nonlinear example with generically discontinuous optimal control 
regime in [7]. As a further illustration, we consider here an example 
from [14]: 

Example: The tunnel diode oscillator.

\[ \min\; J(x,u) = \int_0^T \left( u^2(t) + x_1^2(t) \right) dt \]

\[ \text{s.t.} \quad \dot x_1 = x_2, \qquad x_1(0) = x_2(0) = -5, \qquad x_1(T) = x_2(T) = 0, \]

\[ \dot x_2 = -x_1 + x_2 (1.4 - 0.14\, x_2^2) + 4u, \]

\[ |u(t)| \le 1 \quad \text{a.e. on } [0,T]. \]

In the paper [14], the conditions (C1) - (C4) were checked numerically.
In particular, by using a multiple shooting method a bounded matrix
function $Q$ was constructed so that the Riccati condition was satisfied.
Thus it is reasonable to assume that a solution to (C4) exists with
$\|Q\|_\infty < \infty$ and that (C1) - (C3) hold true for some positive $\sigma$ and $\delta$.
In the example situation for $T = 4.5$ the structure of the optimal control
was obtained as $u = 1$ on $[0, \tau_1)$, $u = -1$ on $(\tau_2, \tau_3)$ and $u = -2 p_2$
elsewhere (where $0 < \tau_1 < \tau_2 < \tau_3 < T$ are found approximately from
the numerical solution). The points $\tau_i$ are the so-called junction points.
Consider the term $R_2$ from our above growth analysis for $|u| \le 1$.
Using the example data, and denoting by $\lambda_{0,1}$, $\lambda_{0,2}$ the multipliers of the
constraints $u \le 1$ and $-u \le 1$, we get

\[
\begin{aligned}
R_2 &= u^2 - u_0^2 + 4 p_2 (u - u_0) + \lambda_{0,1} (1 - u_0) + \lambda_{0,2} (u_0 + 1) \\
&= \left\{ (2 u_0 + 4 p_{0,2})(u - u_0) + (u - u_0)^2 \right\} + 4 (u - u_0) [Q (x - x_0)]_2 \\
&\ge \min \left\{ (\lambda_1 + \lambda_2)\, |u - u_0|,\; 0.5\, |u - u_0|^2 \right\} - c\, |x - x_0|
\end{aligned}
\]

whenever $|x - x_0| < \epsilon < 1$. Therefore, the conditions of Theorem 2 are
applicable in the example, and the considered solution turns out to be
a bounded-strong local minimum satisfying a local $L_2$ quadratic growth
estimate.
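The example dynamics are easy to simulate for a given control, which can help when experimenting with candidate switching structures. The sketch below is not the multiple shooting procedure of [14]; it only integrates the state equation with a classical Runge-Kutta scheme and accumulates the cost with a left rectangle rule. The function names, the step count and the clipping of the control to $[-1, 1]$ are assumptions made for illustration.

```python
import numpy as np

def oscillator_rhs(x, u):
    """Right-hand side of the example state equation."""
    x1, x2 = x
    return np.array([x2, -x1 + x2 * (1.4 - 0.14 * x2**2) + 4.0 * u])

def simulate(control, T=4.5, steps=450, x0=(-5.0, -5.0)):
    """Integrate the state equation with RK4 for a control t -> u(t).

    Returns the trajectory (steps + 1 rows) and the approximate cost
    integral of u^2 + x1^2 (left rectangle rule).
    """
    h = T / steps
    xs = np.empty((steps + 1, 2))
    xs[0] = x0
    cost = 0.0
    for j in range(steps):
        t, x = j * h, xs[j]
        # enforce the control bound |u| <= 1 at the three RK4 nodes
        u_a, u_m, u_b = (float(np.clip(control(s), -1.0, 1.0))
                         for s in (t, t + 0.5 * h, t + h))
        k1 = oscillator_rhs(x, u_a)
        k2 = oscillator_rhs(x + 0.5 * h * k1, u_m)
        k3 = oscillator_rhs(x + 0.5 * h * k2, u_m)
        k4 = oscillator_rhs(x + h * k3, u_b)
        xs[j + 1] = x + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        cost += h * (u_a**2 + x[0]**2)
    return xs, cost
```

For instance, `simulate(lambda t: 1.0)` integrates the system under the constant control $u = 1$; an arbitrary control will in general not meet the terminal condition $x(T) = 0$.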

3. Bounded-strong optimality in bang-bang case

Consider a dynamical system whose state equation is linear in state and
control, with known initial position. We ask for a control regime which
steers the system in a given time to a final state as close as possible to the
origin.
In general, the system cannot always be brought to rest within an arbitrarily
given time. Since a (small) deviation from zero in the final position is
allowed, this problem class is also called soft termination control. As a
model case, consider a problem with box control constraints:

\[ \text{(Ps)} \qquad \min\; J(x,u) = \tfrac{1}{2} \|x(T)\|^2 \]

\[ \text{s.t.} \quad \dot x(t) = A(t) x(t) + B(t) u(t) \quad \text{a.e. in } [0,T]; \tag{9} \]

\[ x(0) = a; \tag{10} \]

\[ |u_i(t)| \le 1, \quad i = 1, \dots, k, \quad \text{a.e. in } [0,T]. \tag{11} \]

The Hamilton function related to (Ps) has the form

\[ H(t,x,u,p) = p^T A(t) x + p^T B(t) u, \]

whereas the augmented Hamiltonian reads as

\[ \tilde H(t,x,u,p,\mu) = H + \mu_1^T (u - e) - \mu_2^T (u + e) \]

with $\mu_{1,2} \ge 0$, $e = (1, 1, \dots, 1)^T$.

From Pontryagin's maximum principle, we obtain the switching function

\[ \sigma(t) = B(t)^T p(t), \tag{12} \]

where the costate $p$ satisfies the adjoint equation

\[ \dot p(t) = -A(t)^T p(t), \qquad p(T) = x(T), \]

and the optimal control is given by

\[
u_{0,i}(t) \in
\begin{cases}
\{+1\} & \text{if } \sigma_i(t) < 0, \\
\{-1\} & \text{if } \sigma_i(t) > 0, \\
[-1, +1] & \text{if } \sigma_i(t) = 0,
\end{cases}
\qquad i = 1, \dots, k.
\tag{13}
\]

Further, the multiplier functions $\mu_j$ satisfy the relations

\[ \mu_1(t) = \left( B(t)^T p(t) \right)_-, \qquad \mu_2(t) = \left( B(t)^T p(t) \right)_+, \]

where the right-hand sides denote the negative resp. positive part of the
related vector components.
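The pointwise control law (13) and the multiplier formulas can be written down directly once values of the switching function are available. The following sketch assumes the values $\sigma(t)$ at one time instant are given as an array; the function names are hypothetical, and on the set where $\sigma_i = 0$ the code returns 0 as one admissible representative of $[-1, +1]$.

```python
import numpy as np

def bang_bang_control(sigma):
    """Control law (13): u_i = +1 where sigma_i < 0, u_i = -1 where sigma_i > 0."""
    sigma = np.asarray(sigma, dtype=float)
    return -np.sign(sigma)        # gives 0 where sigma_i == 0 (any value in [-1, 1] is admissible)

def box_multipliers(sigma):
    """Multipliers of u <= e (mu1) and -u <= e (mu2) from the switching function."""
    sigma = np.asarray(sigma, dtype=float)
    mu1 = np.maximum(-sigma, 0.0)   # negative part, nonzero only where u = +1
    mu2 = np.maximum(sigma, 0.0)    # positive part, nonzero only where u = -1
    return mu1, mu2
```

With these definitions the stationarity condition of the augmented Hamiltonian, $\sigma + \mu_1 - \mu_2 = 0$, holds componentwise, and each multiplier vanishes where its constraint is inactive.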

In the case that $\sigma_i(t) = 0$ on a certain interval $I \subset [0,T]$, we say the
control $u_0$ has a singular arc (on $I$). In our optimality analysis, we will
restrict ourselves to the case of piecewise constant $u_0$ without singular
arcs:

Assumption 1 The optimal control has no singular arcs. In addition,
the set of switching points $\Sigma = \{ t \in [0,T] : \exists\, i \in \{1, \dots, k\}$ with
$\sigma_i(t) = 0 \}$ is finite, i.e. $\Sigma = \{ t_s : 1 \le s \le l \}$ for some $l \in \mathbb{N}$.

Remark It is well known that the above assumption holds true e.g. in
the case that $A$ is time-independent and has exclusively real eigenvalues.

Under Assumption 1, the solution obviously satisfies (C1), (C2):
Indeed, $\nabla_u g^\sigma = \mathrm{diag}(\gamma_j)$ with $\gamma_j \in \{+1, -1\}$, and the controllability
assumption (C2) in the case of a (linear) initial value problem always
trivially holds.

When we try, however, to apply the second order optimality criteria
(C3) and (C4) to our problem, serious difficulties occur:

First of all, the Legendre-Clebsch condition (C3) becomes singular
since $\tilde H_{uu} = 0$ a.e. on $[0,T]$. It can be fulfilled formally only in
the limit sense with $\delta = 0$ where $T'_0 = \{0\}$ for all $t \in [0,T] \setminus \Sigma$.

The limit case of the corresponding Riccati inequality for $\delta \to 0$ gives

\[ \dot Q + A^T Q + Q A \succeq \gamma I \quad \text{a.e.}, \qquad I - Q(T) \succeq \gamma I, \tag{14} \]

which obviously can be fulfilled with arbitrary positive $\gamma$. In particular,
one can choose $Q_1$ such that in both parts of (14) equality holds with
$\gamma = 0.5$. Then for $\gamma < 0.5$, the matrix function $Q_\gamma = 2 \gamma Q_1$ solves (14).
Moreover, it belongs to $L_\infty(0,T; \mathbb{R}^{n \times n})$ with $\|Q_\gamma\|_\infty = 2 \gamma \|Q_1\|_\infty \le
c(A)\, \gamma$.
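That the scaled function $Q_\gamma = 2\gamma Q_1$ satisfies the limit Riccati inequality can be verified in two lines. The check below only uses the defining equalities of $Q_1$, namely $\dot Q_1 + A^T Q_1 + Q_1 A = \tfrac12 I$ and $I - Q_1(T) = \tfrac12 I$.

```latex
\begin{align*}
\dot Q_\gamma + A^T Q_\gamma + Q_\gamma A
  &= 2\gamma \left( \dot Q_1 + A^T Q_1 + Q_1 A \right)
   = 2\gamma \cdot \tfrac12 I = \gamma I, \\
I - Q_\gamma(T)
  &= I - 2\gamma\, Q_1(T)
   = I - 2\gamma \cdot \tfrac12 I
   = (1 - \gamma) I \succeq \gamma I
   \quad \text{for } \gamma \le \tfrac12 .
\end{align*}
```

So the differential inequality holds with equality, and the terminal condition holds with a margin of $(1 - 2\gamma) I$.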

It has to be mentioned, however, that in the case of a singular matrix
$\tilde H_{uu}$ the conditions (C3), (C4) with $\delta = 0$ in general are not sufficient
to show the optimality of the solution even in the weak local sense. For
our problem class, e.g., the linearization in the (nearly-)active constraints
will allow only for zero control variation, which together with the admissibility
assumption for the linear state equation case reduces the state
variation to zero, too. The local technique of deriving estimates for $\Psi$
(resp. for $J - J^*$) from its Taylor expansion (see [21] and [7], [6])
therefore fails in the given situation.

Having in mind these arguments, let us restart the optimality analysis
for $\Psi(x,u,S)$ (cf. (4)) with the integrand $R = R_1 + R_2$ from (6).

Notice that from $f(x,u) = Ax + Bu$ we obtain in particular the following
estimate for $R_1$ near $x_0$:

\[ R_1[t] \approx 0.5\, (x - x_0)^T \left( \dot Q + Q A + A^T Q + \tilde H_{xx} \right) (x - x_0). \]

In the case when (Ps) is considered, we have $\tilde H_{xx} = 0$. Choosing $Q = Q_\gamma$
and $y = x - x_0$ such that $\|y\|_\infty < \epsilon_1$ with sufficiently small $\epsilon_1 > 0$ we
get

\[ R_1[t] \ge 0.25\, y^T \left( \dot Q + Q A + A^T Q \right) y \ge \frac{\gamma}{4} |y|^2. \tag{15} \]
We will further derive appropriate estimates for $R_2$ and the integrals
$\Psi$ under the following regularity assumption on the zeros of the switching
function $\sigma$:

Assumption 2 There exist positive constants $c_0$, $\bar\delta$ with the following
property: For $\delta \in (0, \bar\delta)$, denote $\omega_\delta = \bigcup_{1 \le s \le l} (t_s - \delta, t_s + \delta)$. Then,

\[ \min_i \left| \left( B^T p(t) \right)_i \right| \ge c_0 \delta \qquad \forall t \in [0,T] \setminus \omega_\delta. \]

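In the case of problem (Ps), the term $R_2$ connected with variations
w.r.t. $u$ may be expressed by

\[
\begin{aligned}
R_2[t] = {} & \tilde H(x,u,p_0,\mu_0) - \tilde H(x,u_0,p_0,\mu_0) + (x - x_0)^T Q B (u - u_0) \\
& - \mu_{0,1}^T (u - u_0) + \mu_{0,2}^T (u - u_0).
\end{aligned}
\]

Denoting $v = u - u_0$ with an arbitrary feasible control $u$, we have:

\[ (B^T p_0)_i > 0 \;\Rightarrow\; (u_0)_i = -1 \;\Rightarrow\; v_i \ge 0, \]

\[ (B^T p_0)_i < 0 \;\Rightarrow\; (u_0)_i = +1 \;\Rightarrow\; v_i \le 0. \]

Therefore,

\[ R_2[t] = (B^T p_0)^T v + y^T Q B v \ge \sum_{i=1}^{k} \left| (B^T p_0)_i \right| |v_i| - \left| y^T Q B v \right|. \]

In order to estimate the integral over $R_2$, each part of the right-hand
side will be integrated separately and estimated now:

\[ J_1 = \int_0^T \left| y^T Q B v \right| dt \le \|y\|_\infty \|Q\|_\infty \|B\|_\infty \int_0^T |v(t)|\, dt. \]

From the state equation it follows that $\|y\|_\infty \le c(A,B)\, \|v\|_1$, thus

\[ J_1 \le \|Q\|_\infty \|B\|_\infty\, c(A,B)\, \|v\|_1^2 =: \gamma\, c_2 \|v\|_1^2 \]

when $Q = Q_\gamma$ and for a certain constant $c_2 = c_2(A,B) > 0$.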




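For the remaining part, split into the integral over $[0,T] \setminus \omega_\delta$ and the one over $\omega_\delta$,

\[ J_2 = \int_0^T \sum_{i=1}^{k} \left| (B^T p_0)_i \right| |v_i|\, dt = \int_{[0,T] \setminus \omega_\delta} \dots\, dt + \int_{\omega_\delta} \dots\, dt, \]

it is easy to see that the second part is nonnegative. Neglecting this
integral (which for small $\delta$ is of small size), from Assumption 2 we conclude

\[ J_2 \ge c_0 \delta \int_{[0,T] \setminus \omega_\delta} \sum_{i=1}^{k} |v_i(t)|\, dt \ge c_0 \delta \int_{[0,T] \setminus \omega_\delta} |v|\, dt \]

(where $|v|$ stands for the Euclidean vector norm of $v = v(t)$ in $\mathbb{R}^k$).
On the other hand,

\[ \|v\|_1 = \int_0^T |v(t)|\, dt = \int_{[0,T] \setminus \omega_\delta} |v(t)|\, dt + \int_{\omega_\delta} |v(t)|\, dt \le \int_{[0,T] \setminus \omega_\delta} |v(t)|\, dt + c_1 \delta \]

for $c_1 = 4 l k M$, provided that $\|u\|_\infty$ and $\|u_0\|_\infty$ are bounded by $M$ ($M = 1$ in
(Ps), e.g.). Inserting this estimate into our relation for $J_2$, we obtain

\[ J_2 \ge c_0 \delta \left( \|v\|_1 - c_1 \delta \right). \]

Combining now the estimates for $J_1$ and $J_2$, an estimate for $\int R_2\, dt$
results:

\[ \int_0^T R_2[t]\, dt \ge c_0 \delta \left( \|v\|_1 - c_1 \delta \right) - c_2 \gamma \|v\|_1^2. \tag{16} \]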



LEMMA 1 (Weak local optimality.)

Let the Assumptions 1, 2 hold for the optimal data. Then positive
constants $\epsilon$, $c_w$ exist such that

$$\int_0^T R[t]\,dt \ge c_w \left( \|x - x_0\|_2^2 + \|u - u_0\|_1^2 \right) \qquad (17)$$

for all admissible $(x,u)$ in the weak $\epsilon$-neighbourhood of $(x_0,u_0)$.
Therefore, $(x_0,u_0)$ is a strict weak local minimizer (with
$J(x,u) - J(x_0,u_0)$ satisfying a local quadratic growth condition
similar to (17)).



Proof. Our previous analysis allows us to discuss the estimate (16) for
various $\delta$. In particular, choose

$$\delta = \frac{\|v\|_1}{2c_1} .$$

Then, for $\|v\|_\infty = \|u - u_0\|_\infty < \epsilon \le \epsilon_2 = (2c_1\bar\delta)/T$,
we have $\delta < \bar\delta$, and we obtain

$$\int_0^T R_2[t]\,dt \ge \left( \frac{c_0}{4c_1} - c_2\gamma \right) \|v\|_1^2 =: c(\gamma)\,\|v\|_1^2 .$$

If $\gamma$ is taken from $\left(0, \frac{c_0}{4c_1c_2}\right)$ then $c(\gamma)$ is positive, so that together with
(15) the last estimate leads to

$$\int_0^T R[t]\,dt \ge \frac{\gamma}{4}\, \|x - x_0\|_2^2 + c(\gamma)\, \|u - u_0\|_1^2 .$$

Thus, the desired conclusion follows by setting $c_w = \max_{\gamma>0} \min\{\gamma/4,\, c(\gamma)\}$
and $\epsilon = \min\{\epsilon_1, \epsilon_2\}$. □
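
The arithmetic behind the constant $c(\gamma)$ can be checked in two lines. The following is only a worked sketch: it assumes the choice $\delta = \|v\|_1/(2c_1)$ in (16), which is the value consistent with the constants $c_0/(4c_1)$ and $\epsilon_2 = (2c_1\bar\delta)/T$ appearing in the proof.

```latex
\begin{align*}
c_0\,\delta\,\bigl(\|v\|_1 - c_1\delta\bigr)\Big|_{\delta = \|v\|_1/(2c_1)}
  &= c_0\,\frac{\|v\|_1}{2c_1}\,\Bigl(\|v\|_1 - \frac{\|v\|_1}{2}\Bigr)
   = \frac{c_0}{4c_1}\,\|v\|_1^2 ,\\
\delta < \bar\delta
  &\;\Longleftarrow\; \|v\|_1 \le T\,\|v\|_\infty < T\,\epsilon_2 = 2c_1\bar\delta .
\end{align*}
```

Subtracting $c_2\gamma\|v\|_1^2$ from the first line gives the factor $c_0/(4c_1) - c_2\gamma$, which is positive exactly when $\gamma < c_0/(4c_1c_2)$.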



LEMMA 2 (Strong local optimality.)

Let the Assumptions 1, 2 hold true. Then positive constants $\epsilon$, $c_s$ exist
such that

$$\int_0^T R[t]\,dt \ge c_s\, \|x - x_0\|_\infty^2 \qquad (18)$$

for all admissible $(x,u)$ with $\|x - x_0\|_\infty < \epsilon$.

Therefore, $(x_0,u_0)$ is a strict strong local minimizer (with
$J(x,u) - J(x_0,u_0)$ satisfying a local quadratic growth condition
similar to (18)).

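Proof: Consider (16) with

$$\delta = \min\left\{ \frac{1}{2c_1},\ \frac{\bar\delta}{2MT} \right\} \|v\|_1 =: c_3\,\|v\|_1 .$$

Since $\|v\|_\infty \le 2M$, we have $\delta \le \frac{\bar\delta}{2MT}\,\|v\|_1 \le \bar\delta$ so that, for
$\gamma < (c_0 c_3)/(4c_2)$ and $\|y\|_\infty < \epsilon_1$,

$$\int_0^T R[t]\,dt \ge \frac{c_0 c_3}{4}\, \|v\|_1^2 \qquad (19)$$

follows. Notice that due to the state equation we have

$$\|y\|_\infty \le c(A,B)\, \|v\|_1 ,$$

and consequently (18) holds for $c_s \le c_0 c_3/(4c^2(A,B))$. □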

Remark: Notice that in the case of a compact control set as in the model 
problem (P 5 ) the definitions of strong and of bounded-strong local op- 
timality coincide. 

As a conclusion from the last two lemmas we easily obtain:

Theorem 3 Let the Assumptions 1, 2 be satisfied for the solution $(x_0,u_0)$
of problem (P5). Then, $(x_0,u_0)$ is a (bounded-) strong local minimizer,
and positive constants $c$, $\epsilon$ exist such that

$$J(x,u) - J(x_0,u_0) \ge c \left( \|x - x_0\|_2^2 + \|u - u_0\|_1^2 \right) \qquad (20)$$

for all admissible pairs with $\|x - x_0\|_\infty \le \epsilon$.

4. Minimizing sequence stabilization 

In this final section, it will be shown how the results of Theorems 2
and 3 can be used to obtain certain preliminary convergence results for
minimizing sequences of (P) resp. (P5). The results follow the lines of
[6] (Propos. 6).

For convenience, the following assumption is made on the system
dynamics:

Assumption 3 There exists a constant $M > 0$ such that for any piecewise
continuous $u$ with $\|u\|_\infty \le M$ the state boundary value problem (1),
(2) has a bounded solution on $[0, T]$. If solutions corresponding to $u_{1,2}$
are denoted by $x_{1,2}$ resp., then $\|x_1 - x_2\|_\infty \le c_0\, \|u_1 - u_2\|_1$ holds true
for some constant $c_0 > 0$.

Consider first the situation of section 3 where the Legendre-Clebsch
condition is fulfilled in the sense of (C3), and where Theorem 2 allows
for a local quadratic growth estimation of the objective functional.

LEMMA 3 Let $(x_0,u_0)$ be a bounded-strong local minimizer for (P)
and suppose the Assumptions 1-3 to hold true. Further, assume the
estimate (8), i.e.

$$J(x,u) - J(x_0,u_0) \ge c\, \|x - x_0\|_2^2$$

to be valid for $\|x - x_0\|_\infty \le \epsilon$. If $\{(x_k,u_k)\}$ is a minimizing sequence with
uniformly bounded and piecewise continuous controls $u_k$, and if for all
$k$, $\|u_k - u_0\|_1$ is sufficiently small, then $x_k \to x_0$ in the $L_2$ sense.

The proof is similar to that of Proposition 6 in [6]. Notice that in
practice a minimizing sequence can be obtained, e.g., by the Euler
discretization approach ([1], also [2]). The error estimates in the cited
papers are given in terms of the maximal deviation in the discretization
grid points, a condition which for piecewise continuous controls $u_0$ is
sufficient for their $L_1$ closeness to the solution.

We add an analogous result for the bang-bang situation, although in
this case (without an appropriate coercivity assumption like (C3)) the
convergence of discretization schemes and their theoretical benefit for
constructing minimizing sequences is still an open question:

LEMMA 4 Let $(x_0,u_0)$ be a bounded-strong local minimizer for (P5)
and suppose the Assumptions 1-3 to hold true. Further, assume the
estimate (20), i.e.

$$J(x,u) - J(x_0,u_0) \ge c \left( \|x - x_0\|_2^2 + \|u - u_0\|_1^2 \right)$$

to be valid for $\|x - x_0\|_\infty \le \epsilon$. If $\{(x_k,u_k)\}$ is a minimizing sequence with
uniformly bounded and piecewise continuous controls $u_k$, and if for all
$k$, $\|u_k - u_0\|_1$ is sufficiently small, then $x_k \to x_0$ in $L_2$, and moreover,
$u_k \to u_0$ w.r.t. the $L_1$ topology.

Abstract No polynomial-time algorithm is known for the graph isomorphism
problem. In this paper, we determine graph isomorphism with the help of
a perfect matching algorithm, which limits the range of the search for
1-to-1 correspondences between the two graphs. We reconfigure the graphs
into layered graphs, labeling vertices by partitioning the set of
vertices by degrees. We prepare a correspondence table that records
whether the labels on two layered graphs match. Using that table, we
seek a 1-to-1 correspondence between the two graphs. By limiting the
search for 1-to-1 correspondences to the information in the table, we
are able to determine graph isomorphism more efficiently than by other
known algorithms. The algorithm was timed on experimental data
and we obtained a complexity of O(n^).

Keywords: Graph Isomorphism, Regular Graph

1. Introduction 

The graph isomorphism problem is to determine whether two given
graphs are isomorphic or not. It is not known whether the problem
belongs to the class P or is NP-complete. It has been shown,
however, that the problem can be reduced to a group theory problem
(van Leeuwen, 1990).

Most studies of graph isomorphism (Hopcroft and Wong, 1974; Lueker,
1979; Babai et al., 1980; Galil et al., 1987; Hirata and Inagaki, 1988;
Akutsu, 1988) restrict graphs by their characteristics. Some studies are
based on group theory. Most studies are concerned with the existence
of algorithms (Filotti and Mayer, 1980; Babai et al., 1982; Luks,
1982; Babai and Luks, 1983; Agrawal and Arvind, 1996), and a few
papers report the implementation of algorithms (Corneil and Gotlieb,
1970) and experimental results.

At present the best computational complexity by worst case analysis 
(Babai and Luks, 1983; Kreher and Stinson, 1998) is O This 

algorithm makes use of the unique certification of a graph. 

In the present paper, we consider the graph isomorphism problem for
undirected connected regular graphs whose vertices and edges have
no weight. We seek graph isomorphism by means of perfect matching
to limit the range of 1-to-1 correspondences between the two graphs as
follows.

First, we choose one vertex as root for each graph and reconfigure the
graphs into layered graphs corresponding to the chosen vertices. Next,
we label the vertices by partitioning the set of vertices by the distance
from the root vertex. We construct a correspondence table which reflects
whether the labels on the two layered graphs are the same or not. Then,
referring to that table, we search for a 1-to-1 correspondence between
the two graphs.

In other words, we create a bipartite graph between $V_1$ and $V_2$ and
find a perfect matching in this bipartite graph.

In the worst case, we might enumerate all the combinations of vertices 
among the two graphs, which would be of exponential order. However, 
we have been successful in determining the isomorphism of graphs within 
a reasonable time using experimental data; these results are also reported 
in the present paper. 

We consider only regular graphs. Since the general graph isomorphism 
problem can be reduced to the regular graph isomorphism problem in 
polynomial time (Booth, 1978), this restriction does not lose generality. 

1.1. Perfect Matching Problem

The matching problem on a bipartite graph is the problem of
finding a set of edges such that no two edges share a
vertex (Iri, 1969). If the set covers all the vertices, it is called
a perfect matching.

Figure 1. Layered Graph

It is known that there exist polynomial-time algorithms for finding a
perfect matching (Micali and Vazirani, 1980, etc.).
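
To make the notion concrete, the following is a minimal sketch of a polynomial-time perfect matching search on a bipartite graph. It uses the simple augmenting-path method, not the Micali and Vazirani algorithm cited above; the function name `perfect_matching` and the adjacency-list input format are assumptions made for illustration only.

```python
from typing import Dict, List, Optional


def perfect_matching(adj: List[List[int]], n_right: int) -> Optional[Dict[int, int]]:
    """Return a perfect matching {left: right}, or None if none exists.

    adj[i] lists the right-hand vertices adjacent to left vertex i
    (hypothetical input format chosen for this sketch).
    """
    n_left = len(adj)
    if n_left != n_right:
        return None  # a perfect matching must cover both sides
    match_right = [-1] * n_right  # match_right[j] = left partner of j, or -1

    def try_augment(i: int, seen: List[bool]) -> bool:
        # Search for an augmenting path starting at left vertex i.
        for j in adj[i]:
            if seen[j]:
                continue
            seen[j] = True
            if match_right[j] == -1 or try_augment(match_right[j], seen):
                match_right[j] = i
                return True
        return False

    for i in range(n_left):
        if not try_augment(i, [False] * n_right):
            return None  # vertex i cannot be matched
    return {match_right[j]: j for j in range(n_right)}
```

For example, `perfect_matching([[0, 1], [0], [1, 2]], 3)` finds a matching covering all six vertices, whereas `perfect_matching([[0], [0]], 2)` returns `None` because both left vertices compete for the same partner.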

1.2. Preliminaries 

Let the two given regular graphs be $G_1 = (V_1,E_1)$ and $G_2 = (V_2,E_2)$,
where $|V_1| = |V_2| = |V| = n$ and $|E_1| = |E_2| = |E|$ $(= O(n^2))$. Each
vertex is uniquely labeled and is stored in an array of size $n$. Graph
isomorphism is defined as follows.

Definition 1 Two graphs $G_1 = (V_1,E_1)$ and $G_2 = (V_2,E_2)$ are
isomorphic if there is a 1-to-1 correspondence $f : V_1 \to V_2$ such that
$(v,v') \in E_1$ iff $(f(v),f(v')) \in E_2$ for any $v, v' \in V_1$. This function $f$
is called an isomorphism between $G_1$ and $G_2$.
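
Definition 1 translates directly into a check that a candidate map preserves both edges and non-edges. The sketch below is illustrative only: the name `is_isomorphism` and the edge-list representation are assumptions, and it presumes $|V_1| = |V_2|$ so that an injective map is automatically a 1-to-1 correspondence.

```python
from typing import Dict, Hashable, Iterable, List, Tuple

Edge = Tuple[Hashable, Hashable]


def is_isomorphism(f: Dict[Hashable, Hashable],
                   vertices1: Iterable[Hashable],
                   edges1: Iterable[Edge],
                   edges2: Iterable[Edge]) -> bool:
    """Check Definition 1 for undirected graphs given as edge lists."""
    e1 = {frozenset(e) for e in edges1}
    e2 = {frozenset(e) for e in edges2}
    vs: List[Hashable] = list(vertices1)
    # f must be injective (assumes both graphs have the same vertex count).
    if len({f[v] for v in vs}) != len(vs):
        return False
    # Every pair must be an edge in G1 exactly when its image is an edge in G2.
    for a in range(len(vs)):
        for b in range(a + 1, len(vs)):
            v, w = vs[a], vs[b]
            if (frozenset((v, w)) in e1) != (frozenset((f[v], f[w])) in e2):
                return False
    return True
```

Checking all vertex pairs costs $O(n^2)$ set lookups, which matches the $|E| = O(n^2)$ bound stated above.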

Similarly we could define graph isomorphism in the case where one 
vertex is fixed in each graph. 

We consider only regular graphs for which the vertex degree satis- 
fies 3 < d < because of the relation between a graph and its 

complement. 

2. Reconfiguring Graphs to Layered Graphs 

In the present paper, we make use of layered graphs to determine 
isomorphism. 

2.1. Layered Graphs 

Given a graph $G$ and a vertex $r \in V$, the layered graph $L(G, r)$ with
root $r$ consists of

■ vertices of $G$,

■ edges of $G$,

■ $level(u)$ for each vertex $u$,

where $level(u)$ is the shortest distance (or the depth) from $r$ to $u$ (Figure
1). Transforming a graph with $n$ vertices to a layered graph can be done
in $O(n^2)$ time.
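
The levels of a layered graph can be obtained with a breadth-first search from the root. The following sketch assumes an adjacency-list dictionary as input; the name `layered_levels` is hypothetical. With adjacency lists the search visits every edge twice, so it stays within the $O(n^2)$ bound quoted above.

```python
from collections import deque
from typing import Dict, Hashable, List


def layered_levels(adj: Dict[Hashable, List[Hashable]],
                   root: Hashable) -> Dict[Hashable, int]:
    """Return level(u), the shortest distance from root, for every reachable u."""
    level = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in level:  # first visit gives the shortest distance
                level[w] = level[u] + 1
                queue.append(w)
    return level
```

For a cycle on six vertices rooted at vertex 0, the two neighbours of the root receive level 1 and the opposite vertex receives level 3.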

2.2. Characteristics of Layered Graphs

We divide the set of vertices adjacent to $v$ into 3 subsets, $D_u(v)$,
$D_s(v)$, and $D_d(v)$, as follows:

■ $D_u(v) = \{v' \mid (v,v') \in E \text{ and } level(v') = level(v) - 1\}$,

■ $D_s(v) = \{v' \mid (v,v') \in E \text{ and } level(v') = level(v)\}$,

■ $D_d(v) = \{v' \mid (v,v') \in E \text{ and } level(v') = level(v) + 1\}$.

Let the number of vertices of each subset be $d_u$, $d_s$, and $d_d$:

■ $d_u(v) = |D_u(v)|$, (upper degree)

■ $d_s(v) = |D_s(v)|$, (same level degree)

■ $d_d(v) = |D_d(v)|$. (lower degree)

It follows that the degree of $v$, $d(v)$, is equal to $d_u(v) + d_s(v) + d_d(v)$.

The following properties are immediate:

■ $d_u(r) = d_s(r) = 0$, $d_d(r) = d(r)$,

■ each vertex $v$ except the root vertex satisfies $d_u(v) \ge 1$,

■ all vertices adjacent to the vertices in level $i$ have level $i$ or $(i \pm 1)$.
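
The three degrees can be computed in one pass over the adjacency lists once the levels are known. The sketch below is an illustration under stated assumptions: the graph is connected and given as an adjacency-list dictionary, and the names `vertex_labels` and `classes` are hypothetical. It also groups vertices with equal tuples $(level, d_u, d_s, d_d)$ into classes.

```python
from collections import defaultdict, deque
from typing import Dict, Hashable, List, Tuple

Label = Tuple[int, int, int, int]


def vertex_labels(adj: Dict[Hashable, List[Hashable]],
                  root: Hashable) -> Dict[Hashable, Label]:
    """Return (level, d_u, d_s, d_d) for every vertex of a connected graph."""
    level = {root: 0}
    queue = deque([root])
    while queue:  # breadth-first search assigns shortest distances
        u = queue.popleft()
        for w in adj[u]:
            if w not in level:
                level[w] = level[u] + 1
                queue.append(w)
    labels = {}
    for v in adj:
        d_u = sum(1 for w in adj[v] if level[w] == level[v] - 1)
        d_s = sum(1 for w in adj[v] if level[w] == level[v])
        d_d = sum(1 for w in adj[v] if level[w] == level[v] + 1)
        labels[v] = (level[v], d_u, d_s, d_d)
    return labels


def classes(adj: Dict[Hashable, List[Hashable]],
            root: Hashable) -> Dict[Label, List[Hashable]]:
    """Group vertices that carry the same label, sorted by label."""
    groups = defaultdict(list)
    for v, lab in vertex_labels(adj, root).items():
        groups[lab].append(v)
    return dict(sorted(groups.items()))
```

On the triangular prism (a 3-regular graph on six vertices) rooted at vertex 0, every label satisfies $d_u + d_s + d_d = 3$, and the root carries the label $(0, 0, 0, 3)$.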

Given these properties, we propose the following.

Proposition 1 Two graphs $G_1 = (V_1,E_1)$ and $G_2 = (V_2,E_2)$ are
isomorphic if and only if there are vertices $v_1 \in V_1$ and $v_2 \in V_2$ such that the
two layered graphs $L(G_1,v_1)$ and $L(G_2,v_2)$ are isomorphic.

Each vertex $v \in V$ has a label¹ $(level(v), d_u(v), d_s(v), d_d(v))$. Let
the label be denoted by $M(v)$. We call the set of vertices that have the
same label a “class,” which we denote by $B_i$ ($1 \le i \le k$, where $k$ is the
number of classes). For example, data from Figure 1 are shown in Table
1 sorted by label. We denote by $C(G,v)$ the vertices of $G$ partitioned
into classes.

¹A label for a general graph is constructed by appending each vertex’s degree $d(v)$ to
the level.

Table 1. Example of Labeling. Data are from the graph shown in Figure 1.



3. Finding a 1-to-1 correspondence between two
graphs

In this section, we consider how to make use of a perfect matching
algorithm in order to determine the isomorphism of graphs.

3.1. Correspondence between 2 Layered Graphs

For two given graphs, we consider all layered graphs for which a vertex
of the graph is the root.

For $v_i \in V_1$ and $v_j \in V_2$, we set $c_{ij} = 1$ if $C(G_1,v_i)$ and $C(G_2,v_j)$
have the same labels and partitions, otherwise $c_{ij} = 0$. Thus, we have a
correspondence table as shown in Table 2.

It is easy to see that each table entry’s value is unique and does not
depend on how the two graphs are represented.
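
A possible construction of the correspondence table is sketched below. It reads “the same labels and partitions” as: the two rooted layered graphs produce the same labels with the same class sizes, i.e. equal multisets of labels. That reading, the function names, and the adjacency-list input (vertices numbered $0, \dots, n-1$, connected graphs) are assumptions of this sketch, not details confirmed by the text.

```python
from collections import Counter, deque
from typing import List


def label_signature(adj: List[List[int]], root: int) -> Counter:
    """Multiset of labels (level, d_u, d_s, d_d) of the graph rooted at root."""
    level = {root: 0}
    queue = deque([root])
    while queue:  # breadth-first search from the root
        u = queue.popleft()
        for w in adj[u]:
            if w not in level:
                level[w] = level[u] + 1
                queue.append(w)
    sig = Counter()
    for v in range(len(adj)):
        d_u = sum(1 for w in adj[v] if level[w] == level[v] - 1)
        d_s = sum(1 for w in adj[v] if level[w] == level[v])
        d_d = sum(1 for w in adj[v] if level[w] == level[v] + 1)
        sig[(level[v], d_u, d_s, d_d)] += 1
    return sig


def correspondence_table(adj1: List[List[int]],
                         adj2: List[List[int]]) -> List[List[int]]:
    """c[i][j] = 1 if rooting G1 at i and G2 at j gives equal label multisets."""
    sig1 = [label_signature(adj1, i) for i in range(len(adj1))]
    sig2 = [label_signature(adj2, j) for j in range(len(adj2))]
    return [[1 if s1 == s2 else 0 for s2 in sig2] for s1 in sig1]
```

Because each signature depends only on distances and degrees, the table does not change when the vertices of either graph are renumbered, apart from the corresponding permutation of rows or columns. With one breadth-first search per root, building all signatures costs $n$ searches per graph.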



Table 2. Table of Layered Graphs 

The entries with a value of 1 are candidates for a 1-to-1 correspondence
between vertices of the two graphs. As a result, we can obtain such a
correspondence by finding perfect matchings according to the table
(Figure 2).

In the graph isomorphism problem, we have to determine whether
there exists a 1-to-1 correspondence between vertices in two graphs by
checking all possible perfect matchings² (in the correspondence table).
Of course, a perfect matching does not always indicate isomorphism,
so we have to enumerate all perfect matchings and test each for
isomorphism. However, the table limits the range searched for a 1-to-1
correspondence.

If there is no perfect matching between two graphs based on this table,
they are not isomorphic.
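
The search described here can be sketched as a backtracking enumeration that only pairs vertex $i$ of $G_1$ with vertex $j$ of $G_2$ when the table entry $c_{ij}$ is 1, and that tests adjacency as soon as a pair is fixed. This is an illustrative sketch, not the authors' program: the name `find_isomorphism`, the adjacency-list input with vertices numbered $0, \dots, n-1$, and the early pruning are assumptions.

```python
from typing import List, Optional


def find_isomorphism(adj1: List[List[int]], adj2: List[List[int]],
                     table: List[List[int]]) -> Optional[List[int]]:
    """Return image[i] = vertex of G2 matched to vertex i of G1, or None."""
    n = len(adj1)
    if n != len(adj2):
        return None
    nbr1 = [set(a) for a in adj1]
    nbr2 = [set(a) for a in adj2]
    image = [-1] * n
    used = [False] * n

    def extend(i: int) -> bool:
        if i == n:
            return True  # a perfect matching that preserves adjacency
        for j in range(n):
            if used[j] or not table[i][j]:
                continue  # only table entries equal to 1 are candidates
            # Edges and non-edges to already matched vertices must agree.
            if all((k in nbr1[i]) == (image[k] in nbr2[j]) for k in range(i)):
                image[i], used[j] = j, True
                if extend(i + 1):
                    return True
                image[i], used[j] = -1, False
        return False

    return list(image) if extend(0) else None
```

In the worst case (a table filled with 1's and no pruning) the enumeration is still exponential, as noted in the Introduction; the table and the adjacency test only cut branches.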

3.2. Solutions and Issues

We have implemented the above algorithm and in Section 4 applied
it experimentally to determine isomorphism. We test for a 1-to-1
correspondence between vertices in two graphs as follows.

■ Construct a 1-to-1 correspondence table as preprocessing.

■ Test for a 1-to-1 correspondence between vertices in the two graphs.

Next, we enumerate 1-to-1 correspondences one by one until we find a
perfect matching between the vertices of the graphs.

The program based on our algorithm and described in the next section
does not adopt stronger methods to bound the recursion, because we want
to make the effect of using the table easier to see.

However, if all entries in the table are 1’s, we have to enumerate all
perfect matchings. This results in many combinations of 1-to-1
correspondences to test. This might be the worst situation for our
algorithm. In such a situation, however, we can distinguish 2 cases:
whether the 2 graphs are isomorphic or not.



²In practice, it is not necessary to enumerate those perfect matchings to determine
isomorphism.

In the former case, since both graphs would have much symmetry, we
could find a 1-to-1 correspondence early. In the latter case, we do not
need to enumerate all perfect matchings, for the following reasons:

■ Consider the two layered graphs whose root vertices correspond
in the table.

■ Within each pair of corresponding classes of the two layered graphs,
test 1-to-1 correspondences.

- Examine the number of common vertices adjacent to 2 vertices
in each corresponding class.

- If there is no 1-to-1 correspondence for at least one class, the
graphs are not isomorphic.

Thus, we could reduce the complexity of the enumeration.

Also we need to consider what features of graphs indicate the worst 
complexity. 

Among other known algorithms the best complexity in the worst case 
analysis is of time O (Babai and Luks, 1983; Kreher and 

Stinson, 1998). That algorithm determines isomorphism by certifying
graphs uniquely. Although it certifies by partitioning the set of vertices
recursively, the basic idea in partitioning is the question: “which
partitioned set contains vertices adjacent to a certain vertex?” To prevent
unnecessary recursion, it takes advantage of certification results. The
complexity of certification is of exponential order.

4. Experiments 

We have implemented the program described above and experimented 
on various regular graphs. 

4.1. Environment and Graph Data

Our experiment was carried out on a Celeron 450 MHz with 128 MB of
memory (and 128 MB of swap), using C (gcc-2.91.66) on Linux (2.2.14).
We measured running time using the UNIX-like OS command “time.”

We constructed various regular graphs for input using a program
that was implemented according to Matsuda et al., 1992. Those
graphs have from 20 to 120 vertices with a vertex degree of 10.
We constructed not only various isomorphic graphs that have the same
number of vertices and degree but also non-isomorphic ones.

As a result, we conclude that the experimental time complexity is
proportional to O(n^) regardless of whether the graphs are isomorphic
or not. These results tend to be closer to the complexity of making a
correspondence table than to that of examining 1-to-1 correspondences
(perfect matchings) between the two graphs. Besides, we have seen almost
the same results in both the isomorphic and the non-isomorphic case.

We anticipated that the complexity might be larger, as all the perfect
matchings might be enumerated in the non-isomorphic case, but the
experiment showed the program to be much more efficient than that.

Differences between average time, maximum time and minimum time
for a given number of vertices and degree are very small, so the program
is quite stable. Standard deviations in the results are also very small
(though not shown here), and no run took more than 1 second.
Furthermore, in the non-isomorphic case, we could determine the lack of
isomorphism by testing only the table (at least for the graphs used).

5. Conclusions 

In the present paper, targeting unweighted, undirected and connected
regular graphs, we considered graph isomorphism by means of
perfect matching to limit the range of 1-to-1 correspondences between
two graphs as follows. First, we reconfigured the given graph as a
layered graph, labeled vertices by partitioning the set of vertices by
distance from a root vertex, and prepared a correspondence table that
records whether the labels on 2 layered graphs match. Using that table,
we found 1-to-1 correspondences between the two graphs. In our
experiments, we could determine isomorphism within a practical and
stable time.

For further research, we have to examine other types of graphs and
analyse the complexity of the program for them. Also, we wish to compare
our results with practical running results of the best algorithm described
in Babai and Luks, 1983 and Kreher and Stinson, 1998, whose worst-case
complexity is known to be exponential.

Abstract This paper deals with optimal control problems for semilinear
time-dependent partial differential equations. Apart from the PDE, no
additional constraints are present. Solving the necessary conditions for such
problems via the Newton-Lagrange method is discussed. Motivated by 
issues of computational complexity and convergence behavior, the Re- 
duced Hessian SQP algorithm is introduced. Application to a system of 
reaction-diffusion equations is outlined, and numerical results are given 
to illustrate the performance of the reduced Hessian algorithm. 

Keywords: optimal control, parabolic equation, semilinear equation, reduced SQP 
method, reaction-diffusion equation 



Introduction 

There exist two basic classes of algorithms for the solution of opti- 
mal control problems governed by partial differential equations (PDEs). 
They both are of an iterative fashion and are different in that Newton- 
type methods require the repeated solution of the (non-linear) PDE while 
the algorithms of SQP-type deal with the linearized PDE only. Newton- 
type methods have been successfully applied, e.g., to control problems 
for the Navier-Stokes equations in [4] and will not be discussed here. 

The main focus of this paper is on SQP-type methods which basi- 
cally use Newton’s algorithm in order to solve the first order necessary 
conditions. This scheme leads to a linear boundary value problem for 
the state and adjoint variables. It is the size of the discretized linear 
boundary value problem that motivates a variant of this approach in 
the first place: The reduced SQP method, which has been the subject of 
the following papers: [5] introduces reduced Hessian methods in Hilbert 
spaces. [4] studies various second-order methods for optimal control of
the time-dependent Navier-Stokes equations. [2] and [3] discuss algorithms
based on inexact factorization of the full Hessian step (11) which
involve the reduced Hessian (or approximations thereof) in the factors. 
[1] examines preconditioners for the KKT matrices arising in interior 
point methods, also using reduced Hessian techniques. 

This paper is organized as follows: In Section 1, the class of semilinear 
second order parabolic partial differential equations is introduced with 
control provided in distributed fashion. Section 2 covers optimal control 
problems for these PDEs and establishes the first order necessary con- 
ditions. Section 3 describes the basic SQP method in function spaces 
(also called the Newton- Lagrange method in this context), that can be 
used to solve these conditions. The reduced Hessian method is derived 
as a variant thereof. It will be seen that this method is applicable only if 
the linearized PDE is uniquely solvable with continuous dependence on 
the right hand side data. The purpose of the reduced Hessian method is 
to significantly decrease the size of the discretized SQP steps. The asso- 
ciated algorithm which requires the repeated solution of the linearized 
state equation and of the corresponding adjoint is presented in detail. 
In Section 4, this procedure is applied to a system of reaction-diffusion 
PDEs. Finally, numerical results are given in Section 5. 

While the ideas and algorithm are worked out for distributed control 
problems throughout this paper, boundary and mixed control problems 
can be treated in the very same manner with only minor modification 
of notation. 

1. Semilinear Parabolic Equations 

Let $\Omega$ be a bounded domain in $\mathbb{R}^N$ with sufficiently smooth boundary
$\Gamma$ and $Q = \Omega \times (0,T)$, $\Sigma = \Gamma \times (0,T)$ with given final time $T > 0$.
We consider semilinear parabolic initial-boundary value problems of the
following type:

$$\begin{aligned}
y_t(x,t) + A(x)y(x,t) + n(x,t,y(x,t),u(x,t)) &= 0 && \text{in } Q \\
\partial_n y(x,t) + b(x,t,y(x,t)) &= 0 && \text{on } \Sigma \qquad (1) \\
y(x,0) - y_0(x) &= 0 && \text{on } \Omega .
\end{aligned}$$

The elliptic differential operator $A(x)y = -\sum_{i,j=1}^{N} D_j(a_{ij}(x) D_i y)$ is
represented by the matrix $A(x) = (a_{ij}(x)) \in \mathbb{R}^{N \times N}$ which is assumed to be
symmetric, and $\partial_n y(x,t) = n(x)^T A(x) \nabla y(x,t) = \sum_{i,j=1}^{N} a_{ij}\, n_i(x) D_j y(x,t)$
is the so-called co-normal derivative along the boundary $\Gamma$. When $A$
is the negative Laplace operator $-\Delta$, the matrix $A(x)$ is the identity and
$\partial_n y(x,t)$ is simply the normal derivative or Neumann trace of $y(x,t)$.

Questions of solvability, uniqueness and regularity for non-linear PDEs
shall not be answered here. Please refer to [7] and the references cited
therein. We assume that there exist Banach spaces $Y$ for the state, $U$
for the control and $Z$ for the adjoint variable such that the semilinear
parabolic problem (1) is well-posed in the abstract form

$$e(y,u) = 0 \quad \text{with} \quad e : Y \times U \to Z' \qquad (2)$$

where $Z'$ is the dual space of $Z$. The operator $e$ may represent a strong
or weak form of the state equation (1). Casting the PDE in this
convenient form will allow us later to view the control problem as a
PDE-constrained optimization problem and hence support a solution approach
based on the Lagrange functional. However, in the detailed presentation
of the algorithms, we will return to interpreting the operator $e$ and its
linearization $e_y$ as time-dependent PDEs.

2. Optimal Control Problems 

In the state equation (1), the function $u$ defined on $Q$ is called the
distributed control function. A Neumann boundary control problem arises
when, instead of $u$, a control function $v$ is present in the boundary
nonlinearity $b(x,t,y(x,t),v(x,t))$. Other possibilities include Dirichlet
boundary control or even combinations of all of the above. Examples of
boundary control problems can be found, e.g., in [3] and [1]. Everything
presented in this paper can be and in fact has been applied to boundary
control problems with only minor modifications.

The core of optimal control problems is to choose the control function
$u \in U$ in order to minimize a given objective function. In practical
terms, the objective can, e.g., aim at energy minimization or tracking a
given desired state.

We shall use the objective for the distributed control case from [7]:

$$f(y,u) = \int_\Omega \varphi(x, y(x,T))\,dx + \int_Q g(x,t,y,u)\,dx\,dt \qquad (3)$$

where $\varphi$ assesses the terminal state and $g$ evaluates the distributed control
effort and the state trajectory in $(0,T)$.

The abstract optimal control problem considered throughout the rest
of this paper can now be stated:

$$\text{Minimize } f(y,u) \text{ over } (y,u) \in Y \times U \quad \text{s.t. } e(y,u) = 0 \text{ holds.} \qquad (4)$$

A particularly simple situation arises when the state equation (1) is in
fact linear in $(y,u)$ and the objective (3) is convex or even quadratic
positive definite. However, in the general case, our given problem (4) to
find an optimal control $u$ and a corresponding optimal state $y$ minimizing
(3) while satisfying the state equation $e(y,u) = 0 \in Z'$ is a non-convex
problem. We will not address the difficult question of global optimal
solutions but rather assume that a local optimizer $(\bar y, \bar u)$ exists. The
following first order necessary conditions involving the adjoint variable
$\lambda$ are well-known, see, e.g., [7] (with $-\lambda$ instead of $\lambda$):

$$\begin{aligned}
-\lambda_t + A(x)^T \lambda + n_y(x,t,y,u)\lambda + g_y(x,t,y,u) &= 0 && \text{in } Q \\
\partial_n \lambda + b_y(x,t,y)\lambda &= 0 && \text{on } \Sigma \\
\lambda(T) + \varphi_y(x,y(T)) &= 0 && \text{in } \Omega \\
g_u(x,t,y,u) + n_u(x,t,y,u)\lambda &= 0 && \text{in } Q \qquad (5) \\
y_t + A(x)y + n(x,t,y,u) &= 0 && \text{in } Q \\
\partial_n y + b(x,t,y) &= 0 && \text{on } \Sigma \\
y(0) - y_0(x) &= 0 && \text{on } \Omega .
\end{aligned}$$



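These can be derived by constructing the Lagrangian

$$L(y,u,\lambda) = f(y,u) + \langle e(y,u), \lambda \rangle_{Z',Z} \qquad (6)$$

and evaluating the conditions

$$L_y(y,u,\lambda) = 0 \quad \text{in } Y' \quad \text{(adjoint equation)} \qquad (7)$$

$$L_u(y,u,\lambda) = 0 \quad \text{in } U' \quad \text{(optimality condition)} \qquad (8)$$

$$L_\lambda(y,u,\lambda) = e(y,u) = 0 \quad \text{in } Z' \quad \text{(state equation)} \qquad (9)$$

in their strong form.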

Triplets $(\bar y, \bar u, \bar\lambda)$ that satisfy the first order necessary conditions are
called stationary points. Obviously, the conditions (5) or (7)-(9)
constitute a non-linear two-point boundary value problem involving the
non-linear forward equation (initial values given) for the state $y$ and the
linear backward equation (terminal conditions given) for the adjoint $\lambda$.
In the next section we introduce an algorithm to solve this problem.

3. SQP Algorithms

As we have seen in the previous section, finding stationary points
$(\bar y, \bar u, \bar\lambda)$ and thus candidates for the optimal control problem requires
the solution of the non-linear operator equation system (7)-(9). This
task can be attacked by Newton’s method, which is commonly used to
find zeros of non-linear differentiable functions.
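
Suppose that we are given a triplet $(y^k, u^k, \lambda^k)$, the current iterate.
The Newton step to compute updates $(\delta y, \delta u, \delta\lambda)$ reads

$$\begin{bmatrix}
L_{yy}(y^k,u^k,\lambda^k) & L_{yu}(y^k,u^k,\lambda^k) & e_y(y^k,u^k)^* \\
L_{uy}(y^k,u^k,\lambda^k) & L_{uu}(y^k,u^k,\lambda^k) & e_u(y^k,u^k)^* \\
e_y(y^k,u^k) & e_u(y^k,u^k) & 0
\end{bmatrix}
\begin{bmatrix} \delta y \\ \delta u \\ \delta\lambda \end{bmatrix}
= -\begin{bmatrix} L_y(y^k,u^k,\lambda^k) \\ L_u(y^k,u^k,\lambda^k) \\ e(y^k,u^k) \end{bmatrix}
= -\begin{bmatrix} f_y(y^k,u^k) + e_y(y^k,u^k)^*\lambda^k \\ f_u(y^k,u^k) + e_u(y^k,u^k)^*\lambda^k \\ e(y^k,u^k) \end{bmatrix}
\ \text{in}\ \begin{bmatrix} Y' \\ U' \\ Z' \end{bmatrix} . \qquad (10)$$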

This method is referred to as the Newton-Lagrange algorithm. It falls
under the category of SQP solvers since (10) are also the necessary
conditions of an auxiliary QP problem, see, e.g., [6]. Note that in contrast
to the so-called Newton approach (cf. [4]), the iterates $(y^k,u^k)$ of the
SQP method are infeasible w.r.t. the non-linear state equation, i.e. the
method generates control/state pairs that satisfy the PDE (1) only in
the limit.

The operators appearing in the matrix on the left hand side (the
Hessian of the Lagrangian) deserve some further explanation. First it
is worth recalling that the first partial Fréchet derivative of a mapping
$g : X_1 \times X_2 \to Y$ between normed linear spaces $X = X_1 \times X_2$ and $Y$
at a given point $x = (x_1,x_2) \in X$ is a bounded linear operator, e.g.,
$g_{x_1}(x) \in \mathcal{L}(X_1,Y)$. Consequently, the second partial Fréchet derivatives
at $x$ are $g_{x_1x_1}(x) \in \mathcal{L}(X_1, \mathcal{L}(X_1,Y))$, $g_{x_1x_2}(x) \in \mathcal{L}(X_2, \mathcal{L}(X_1,Y))$, etc.
They can equivalently be viewed as bi-linear bounded operators, e.g.,
the latter taking its first argument from $X_2$ and its second from $X_1$ and
mapping this pair to an element of $Y$.

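The adjoint operators (or, precisely speaking, conjugate operators)
appearing in the equation (10) can most easily be explained by their
property of switching the arguments’ order in bilinear maps:

$$\begin{aligned}
& e_y(y^k,u^k) \in \mathcal{L}(Y,Z') \\
& e_y(y^k,u^k)^* \in \mathcal{L}(Z'',Y') \hookrightarrow \mathcal{L}(Z,Y') \quad \text{since } Z \hookrightarrow Z'' \\
& e_y(y^k,u^k)^*(z,y) = e_y(y^k,u^k)(y,z) \quad \text{for all } y \in Y,\ z \in Z .
\end{aligned}$$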

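Exploiting the fact that the adjoint variable $\lambda$ appears linearly in the
Lagrangian $L$, the Newton step (10) can be rewritten in terms of the
new iterate $\lambda^{k+1}$ rather than the update $\delta\lambda$. For brevity, the arguments
$(y^k, u^k, \lambda^k)$ will be omitted from now on:

$$\begin{bmatrix}
L_{yy} & L_{yu} & e_y^* \\
L_{uy} & L_{uu} & e_u^* \\
e_y & e_u & 0
\end{bmatrix}
\begin{bmatrix} \delta y \\ \delta u \\ \lambda^{k+1} \end{bmatrix}
= -\begin{bmatrix} f_y \\ f_u \\ e \end{bmatrix} . \qquad (11)$$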

As can be expected, this system (obtained by linearization of (7)-(9))
represents a linear two-point boundary value problem whose solution is
now the main focus.

To render problem (11) amenable to computer treatment, some
discretization has to be carried out. Inevitably, its discretized version will
be a large system of linear equations since it ultimately contains the
values of the state, control and adjoint at all discrete time steps and
all nodes of the underlying spatial grid. Thus, one seeks to minimize
the dimension of the system by decomposing it into smaller parts. The
reduced Hessian algorithm is designed just for this purpose:

Roughly speaking, it solves the linear operator equation (11) for $\delta u$
first, using Gaussian elimination on the symbols in the matrix. A
prerequisite to this procedure is the bounded invertibility of $e_y(y,u)$ for
all $(y,u)$ which are taken as iterates in the course of the algorithm.
In other words, the linearized state equation $e_y(y,u)h = f$ has to be
uniquely solvable for $h$ (with continuous dependence on the right hand
side $f \in Z'$) at these points $(y,u)$. One obtains the reduced Hessian step



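$$\left( e_u^* e_y^{-*} L_{yy}\, e_y^{-1} e_u - e_u^* e_y^{-*} L_{yu} + L_{uu} - L_{uy}\, e_y^{-1} e_u \right) \delta u
= e_u^* e_y^{-*} \left[ f_y - L_{yy}\, e_y^{-1} e \right] - f_u + L_{uy}\, e_y^{-1} e \qquad (12)$$

$$e_y\, \delta y = -e - e_u\, \delta u \qquad (13)$$

$$e_y^*\, \lambda^{k+1} = -f_y - L_{yy}\, \delta y - L_{yu}\, \delta u . \qquad (14)$$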
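
The elimination behind the reduced Hessian step can be spelled out as follows. This is a worked sketch that starts from the block rows of the system (11), written as $L_{yy}\delta y + L_{yu}\delta u + e_y^*\lambda^{k+1} = -f_y$, $L_{uy}\delta y + L_{uu}\delta u + e_u^*\lambda^{k+1} = -f_u$ and $e_y\delta y + e_u\delta u = -e$; the shorthand $e_y^{-*} = (e_y^*)^{-1}$ is notation introduced here.

```latex
\begin{align*}
\text{row 3:}\quad & \delta y = -e_y^{-1}\bigl(e + e_u\,\delta u\bigr), \\
\text{row 1:}\quad & \lambda^{k+1} = -e_y^{-*}\bigl(f_y + L_{yy}\,\delta y + L_{yu}\,\delta u\bigr)
   = -e_y^{-*}\bigl(f_y - L_{yy}e_y^{-1}e\bigr)
     + e_y^{-*}\bigl(L_{yy}e_y^{-1}e_u - L_{yu}\bigr)\delta u, \\
\text{row 2:}\quad & L_{uy}\,\delta y + L_{uu}\,\delta u + e_u^*\lambda^{k+1} = -f_u \\
\Longrightarrow\quad &
 \bigl(e_u^*e_y^{-*}L_{yy}e_y^{-1}e_u - e_u^*e_y^{-*}L_{yu} + L_{uu} - L_{uy}e_y^{-1}e_u\bigr)\delta u
 = e_u^*e_y^{-*}\bigl[f_y - L_{yy}e_y^{-1}e\bigr] - f_u + L_{uy}e_y^{-1}e .
\end{align*}
```

Each occurrence of $e_y^{-1}$ corresponds to one solve of the linearized state equation and each occurrence of $e_y^{-*}$ to one solve of the adjoint equation, so the reduced operator never has to be formed explicitly.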



The operator preceding Su is called the reduced Hessian Hsu in contrast 
to the full Hessian matrix H appearing in (11). Note that both the full 
and the reduced Hessian are self-adjoint operators. After discretization, 
the reduced Hessian will be small and dense, whereas the full Hessian 
will be large and sparse. Aiming at solving a discretized version of (12) 
using an iterative solver, the action of the reduced Hessian on given el- 
ements Su E U has to be computed, plus the right hand side of (12). 
It can be shown that once an approximate solution Su to (12) is found, 
the remaining unknowns Sy and obeying (13) and (14) can be ex- 
pressed in terms of quantities already calculated. The overall procedure 
to solve (7)-(9) applying the reduced Hessian method on the inner loop 
decomposes nicely into the steps described in figure 1 using the auxiliary 
variables eY and /i 2 , G Z. 

In many practical cases, the objective and the PDE separate as 



= fi{y) + f 2 {u) and e{y,u) = ei(y) + e 2 (u) (15) 

which entails Luy — Lyu = 0. 

We observe that for the computation of the right hand side $b$ as well
as for every evaluation of $H_{\delta u}\delta u$ it is required to solve one equation
involving $e_y$ and another involving $e_y^*$. It will be seen in the sequel that
in our case of $e$ representing a time-dependent PDE these are in fact
solutions of the linearized forward (state) equation and the corresponding
backward (adjoint) equation, see figure 2 in the following section.

Reduced SQP Algorithm

1 Set $k = 0$ and initialize $(y^0, u^0, \lambda^0)$.

2 Solve

(a) $e_y h_1 = e$

(b) $e_y^* h_2 = f_y - L_{yy} h_1$

and set $b := e_u^* h_2 - f_u + L_{uy} h_1$.

3 For every evaluation of $H_{\delta u}\delta u$ inside some iterative solver
of $H_{\delta u}\delta u = b$, solve

(a) $e_y h_3 = e_u \delta u$

(b) $e_y^* h_4 = L_{yy} h_3 - L_{yu} \delta u$

and set $H_{\delta u}\delta u := e_u^* h_4 - L_{uy} h_3 + L_{uu} \delta u$.

4 Set $u^{k+1} := u^k + \delta u$.

5 Set $\delta y := -h_1 - h_3$ and $y^{k+1} := y^k + \delta y$.

6 Set $\lambda^{k+1} := -h_2 + h_4$.

7 Set $k := k + 1$ and go back to step 2.

Figure 1. Reduced SQP Algorithm

Note that the linear system involving the reduced Hessian $H_{\delta u}$ is
significantly reduced in size as compared to the full Hessian of the
Lagrangian, the more so as in practical applications, there are many more
state than control variables.
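
The steps of figure 1 can be illustrated on a small dense toy problem. The following Python sketch is not the implementation used for the tests in this paper: the matrices standing in for $e_y$, $e_u$, $L_{yy}$, $L_{yu}$, $L_{uu}$, the vectors $f_y$, $f_u$, $e$, and all helper names are hypothetical, and the full system (11) is assumed to have the usual KKT block form. With a single control variable, one application of the reduced Hessian to the unit vector replaces the iterative solver; the result is compared with a direct solve of the full system.

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system A x = b."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            fac = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= fac * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

# Hypothetical toy data: two state variables, one control variable.
e_y = [[2.0, 1.0], [0.0, 3.0]]
e_u = [[1.0], [2.0]]
L_yy = [[4.0, 1.0], [1.0, 3.0]]
L_yu = [[0.5], [-1.0]]
L_uy = transpose(L_yu)
L_uu = [[5.0]]
f_y, f_u, e = [1.0, 2.0], [3.0], [0.5, -1.0]
e_yT, e_uT = transpose(e_y), transpose(e_u)

def reduced_hessian_action(du):
    """Step 3 of figure 1: one forward and one adjoint solve per evaluation."""
    h3 = solve(e_y, matvec(e_u, du))
    h4 = solve(e_yT, sub(matvec(L_yy, h3), matvec(L_yu, du)))
    Hdu = add(sub(matvec(e_uT, h4), matvec(L_uy, h3)), matvec(L_uu, du))
    return Hdu, h3, h4

# Step 2: right hand side b.
h1 = solve(e_y, e)
h2 = solve(e_yT, sub(f_y, matvec(L_yy, h1)))
b = add(sub(matvec(e_uT, h2), f_u), matvec(L_uy, h1))

# With one control the reduced Hessian is 1x1, so H du = b is solved directly.
H11 = reduced_hessian_action([1.0])[0][0]
du = [b[0] / H11]
_, h3, h4 = reduced_hessian_action(du)
dy = [-a - c for a, c in zip(h1, h3)]      # step 5
lam = [-a + c for a, c in zip(h2, h4)]     # step 6

# Reference: direct solve of the full KKT system in the unknowns (dy, du, lam).
kkt = [L_yy[i] + L_yu[i] + e_yT[i] for i in range(2)]
kkt.append(L_uy[0] + L_uu[0] + e_uT[0])
kkt += [e_y[i] + e_u[i] + [0.0, 0.0] for i in range(2)]
ref = solve(kkt, [-v for v in f_y + f_u + e])
max_diff = max(abs(a - r) for a, r in zip(dy + du + lam, ref))
print(max_diff)
```

In a realistic setting the two calls to `solve` inside `reduced_hessian_action` would be replaced by a forward and a backward time integration, and the 1x1 division by an iterative solver.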

4. Example 

As an example, distributed control of a semilinear parabolic system
of reaction-diffusion equations will be discussed. The PDE system
describes a chemical reaction $C_1 + C_2 \to C_3$ where the three substances
are subject to diffusion and a simple non-linear reaction law.

While in the discussion so far only one (scalar) PDE appears, the
generalization to systems of PDEs is straightforward. In the example, the
state $y = (c_1, c_2, c_3)^T$ as well as the adjoint $\lambda = (\lambda_1, \lambda_2, \lambda_3)^T$ now have
three scalar components while the control is still one-dimensional. The
linearized systems occurring in the computation of the auxiliary variables
$h_1, \ldots, h_4$ feature a coupling between their components which is
generated by the non-linearity in the state equation (16). Also note that this
example satisfies the separation condition (15).

The reaction-diffusion system under consideration is given by 

$$\begin{aligned}
c_{1t} &= D_1 \Delta c_1 - k_1 c_1 c_2, & \partial_n c_1 &= 0, & c_1(0) &= c_{10} \\
c_{2t} &= D_2 \Delta c_2 - k_2 c_1 c_2 + u, & \partial_n c_2 &= 0, & c_2(0) &= c_{20} \qquad (16) \\
c_{3t} &= D_3 \Delta c_3 + k_3 c_1 c_2, & \partial_n c_3 &= 0, & c_3(0) &= c_{30}
\end{aligned}$$

where the control acts only through component two. The boundary
conditions simply mean that the boundary of the reaction vessel is
impermeable. The constants $D_i$ and $k_i$ are all non-negative and denote
diffusion and reaction coefficients, respectively.

The objective in this case is a standard least-squares-type functional

$$f(y,u) = \int_\Omega [c_1(x,T) - c_{1d}]^2\,dx + \gamma \int_0^T\!\!\int_\Omega u(x,t)^2\,dx\,dt$$

in order to minimize the distance of component one's terminal state
$c_1(x,T)$ to a given desired state $c_{1d}$ while taking control cost into
account, weighted by a factor $\gamma > 0$. In case one is interested in maximum
product yield, the term $-\int_\Omega c_3(x,T)\,dx$ can be inserted into the
objective.

The individual steps in the reduced Hessian algorithm for this particular
example are given in figure 2. There the vector $(c_1^k, c_2^k, c_3^k, \lambda_1^k, \lambda_2^k, \lambda_3^k)^T$
denotes the current iterate. It stands out that the linear systems for
$h_1, \ldots, h_4$ can equivalently be written as

$$h_{it} + K h_i = f_i \quad \text{for } i \in \{1, 3\} \qquad (17)$$

$$-h_{jt} + K^T h_j = g_j \quad \text{for } j \in \{2, 4\} \qquad (18)$$

where the operator matrix

$$K = \begin{pmatrix} -D_1\Delta - k_1 c_2^k & -k_1 c_1^k & 0 \\ -k_2 c_2^k & -D_2\Delta - k_2 c_1^k & 0 \\ k_3 c_2^k & k_3 c_1^k & -D_3\Delta \end{pmatrix} \qquad (19)$$

is non-symmetric. Please notice that this phenomenon does not occur
in scalar PDE control problems.

Reduced Hessian steps for the reaction-diffusion example

Solve for hi — (hu , hi 2 , his)^ : 

hilt - Di^hii - kiC2hii - kic\hi2 = AcJ - kic\cl 

hi2t ~ -C^2 A/ii2 — k2c\hl2 — k2C2hii = C2t — Z^2 Ac2 — k2c\c2 + 

hi3t ~ DsAhis -f ksc^hii + ksCihi2 = C3^ — DsAcs + kscic^ 

dnhii = 0 dnhi2 = 0 dnhis = 0 

^11 ( 0 ) = Ci( 0 ) — cio ^12(0) = 02(0) — C20 hi3{0) = 03(0) — C30 

Solve for /i 2 = (/i 2 i, /^ 22 , /i 23 )^: 

— h21t — DiAh21 — klC2h21 — k2C2h22 4 - kzC2h23 = 921 

— h22t ” D2Ah22 — k2Cih22 — kiCih21 4 - ksCih 23 = 922 

— h23t ~~ D3Ah23 — 0 

921 = —kihi2\i — k2hi2\2 ~ kzhi2\3 

922 = —kihiiXi — k2hii\2 — k3hii\3 

dnh21 = 0 dnh22 — 0 ^n^23 = 0 

h2i{T) = 2[c\{T) - cid - hii{T)] /i 22(T) = 0 /i23(T) = 0 

Set b = —h 22 — . 

Solve for hs = (/ 131 , /^ 32 , /^ 33 )^: 

hsit — DiAhsi — kic^hzi — kic\h32 = 0 
h32t — D2Ahs2 — k2Cihs2 — A^2C2^31 = — n 
hs3t — D3Ah33 — /C3C2/I3I — k3c\h32 = 0 

dnh3i = 0 dnh32 = 0 ^n/l33 = 0 

/i3l(0) = 0 /i32(0) = 0 h33{0) — 0 

Solve for /i 4 = (/i 4 i, /i 42 , ^^ 43 )^: 

— h4it — DiAh4i — /C 1 C 2 / 14 I — k2(^h42 — k3C2h43 = 941 

— h42t ~~ D2Ah42 — /C2Ci/l42 “ AjiCi/l4l — feCi/l43 = p42 

— h43t ~ D3Ah43 = 0 

941 = kih32\i 4- k2h32\2 4- k3h32\t 

942 = kih3iXi 4- k2h3iX3 4- ^^ 3/^31 A 3 

dnh41 — 0 dnh42 ~ 0 ^n^43 ~ 0 

h41 (T) = 2/131 (T) h42 (T) = 0 h43 (T) = 0 

Set HsuCl —h42 4- 270 . 



Figure 2. Reduced SQP Algorithm for the Reaction-Diffusion Example

5. Numerical Results

In this section, results obtained from an implementation of the re- 
duced Hessian algorithm will be presented. All coding has been done 
in Matlab 6.0 using the PDE toolbox to generate the spatial mesh and 
the finite element matrices. The performance of the reduced Hessian 
algorithm will be demonstrated in comparison to an iterative algorithm 
working on the full Hessian of the Lagrangian H given in (11). 

To this end the convergence behavior over iteration count of one
particular SQP step (corresponding to steps 2 and 3 in the algorithm) will
be shown. For the tests we chose

$$\begin{array}{lll}
c_1^k(x,t) = 0.5 & c_{10}(x) = 0.1 + \chi_{\{x_1 > 0.3\}}(x) & \lambda_1^k(x,t) = 0 \\
c_2^k(x,t) = 0.5 & c_{20}(x) = 0.1 + \chi_{\{x_2 > 0.3\}}(x) & \lambda_2^k(x,t) = 0 \\
c_3^k(x,t) = 0.5 & c_{30}(x) = 0 & \lambda_3^k(x,t) = 0 \\
c_{1d}(x) = 0 & D_1 = 0.01 & k_1 = 0.5 \\
u^k(x,t) = 0 & D_2 = 0.05 & k_2 = 1.5 \\
\gamma = 1 & D_3 = 0.15 & k_3 = 2.5
\end{array}$$

on some finite element discretization of the unit circle $\Omega \subset \mathbb{R}^2$, where
$\chi_A$ denotes the indicator function of the set $A \cap \Omega$. The final time was
$T = 10$.

As was seen earlier in equation (11), there are three block rows in
$H$, corresponding to the linearizations of the adjoint equation, the
optimality condition and the state equation, respectively. For our tests,
these have been semi-discretized using piecewise linear triangular finite
elements in space. The ODE systems obtained by the method of lines
are of the following form:

$$M\dot{y} + Ky = f \quad \text{(forward equations)} \qquad (20)$$

$$-M\dot{\lambda} + K^T\lambda = g \quad \text{(backward equations)} \qquad (21)$$

They were treated by means of the implicit Euler scheme with constant
step size. Of course, suitable higher order integrators can be used as well.
This straightforward approach has one drawback that becomes
apparent in figure 3: The discretized full Hessian matrix $H$ is no longer
symmetric, although the continuous operator $H$ is self-adjoint. The same
holds for the discretized reduced Hessian $H_{\delta u}$.

This is due to the treatment of initial and terminal conditions in
the linearized state and the adjoint equation. There are
methods that reestablish symmetry, but these will not be pursued in
the course of this paper since qualitatively, the convergence results
remain unchanged. For that reason, the non-symmetry will be accepted,
thereby waiving the possibility to use, e.g., a conjugate gradient method
to solve the reduced problem but relying on iterative solvers capable of
handling non-symmetric problems. In the tests, GMRES has proved quite efficient
on the full Hessian problem while CGS and BICGSTAB failed to generate
reasonably better iterates than the initial all-zero guess. For the reduced
Hessian, all three algorithms found the solution to high accuracy, and
CGS needed the fewest iterations to do so. As a common basis, GMRES
with no restarts was used for both the full and the reduced Hessian
problem.

Figure 3. Non-symmetry of discretized full Hessian, $n_t = 4$ time steps, implicit Euler, dotted lines indicate blocks corresponding to (11)

Note that while the discretized state and adjoint allocate $n_t$ (equal
to 4 in figure 3) discrete time steps, the discretized control needs only
$n_t - 1$. This is attributed to the use of the Euler method where, after
discretization, $u(t = 0)$ does not appear in any of the equations.

In order to illustrate the convergence properties, it is convenient to
have the exact discretized solution of the full SQP step
(11) at hand. To that end, the full Hessian matrix was set up explicitly
for a set of relatively coarse discretizations, and the exact solution was
computed using a direct solver based on Gaussian elimination (Matlab's
backslash operator). The exact solution $\delta u$ of (12) was obtained in the
same way after setting up the reduced Hessian matrix, where the
corresponding $\delta y$ and $\lambda^{k+1}$ were calculated performing the forward/backward
integration given by (13) and (14). These two reference solution triplets
differ only by entries of order $10^{-15}$ and will be considered equal.

It has to be mentioned that for these low-dimensional examples (cf. ta- 
ble 1), a direct solver is a lot faster than any iterative algorithm. How- 
ever, setting up the exact reduced Hessian matrix of course is not an 
option for fine discretizations. 

Figures 4-6 illustrate the convergence behavior of GMRES working on
the reduced versus the full Hessian matrix: For $\delta u_{\mathrm{ref}}$ denoting the exact
discretized solution, the graphs show the relative error history $e_j(t)$
in the $L^2$ norm, where $\delta u_j(t)$ denotes the approximate solution generated
by the iterative solver after $j$ iterations, taken at the time grid point
$t \in [0,T]$. The same relative errors can be defined for $\delta u$ substituted
by $\delta c_1, \ldots, \delta c_3$ or $\lambda_1^{k+1}, \ldots, \lambda_3^{k+1}$, which are the components of the state
update $\delta y$ and the new adjoint estimate $\lambda^{k+1}$.

Each figure shows the relative error history $e_j(t)$ of either $\delta u$ or $\delta c_1$
obtained using GMRES with no restarts after $j = 4, 8, \ldots, 28$ iterations on
the reduced problem and after $j = 100, 200, \ldots, 600$ iterations on the full
problem. The figures for $\delta c_2$, $\delta c_3$ and $\lambda_1^{k+1}, \ldots, \lambda_3^{k+1}$ look very much the
same and are not shown here. The discretization level is characterized
by the number of discrete time steps $n_t$ and the number of grid points
in the finite element mesh, poi. Table 1 lists the number of optimization
variables in the full and reduced case for the individual discretizations
used.



nt     poi    # of vars (reduced)    # of vars (full)
 9     25       200                    1550
 9     81       648                    5022
19     81      1458                   10692

Table 1. Number of optimization variables for different discretizations
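
The entries of Table 1 can be reproduced by a simple count. The sketch below rests on assumptions that are not spelled out in the text: the control is taken to live on the same spatial mesh with $n_t - 1$ time levels, and the full problem is taken to contain three state and three adjoint components on $n_t$ time levels each, plus the control. The function name `count_variables` is hypothetical.

```python
def count_variables(nt, poi, n_species=3):
    """Return (reduced, full) numbers of unknowns for one discretization level.

    Assumption: control on (nt - 1) time levels, state and adjoint with
    n_species components on nt time levels, all on a mesh with poi points.
    """
    n_control = (nt - 1) * poi
    n_state = n_species * nt * poi
    n_adjoint = n_species * nt * poi
    return n_control, n_state + n_control + n_adjoint

for nt, poi in [(9, 25), (9, 81), (19, 81)]:
    reduced, full = count_variables(nt, poi)
    print(nt, poi, reduced, full)
```

Under these assumptions the counts agree with the three rows of Table 1.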



Figure 4. Relative error history for $\delta u$ (left) and $\delta c_1$ (right) on the reduced (solid
lines) problem for $j = 4, 8, \ldots, 28$ iterations and on the full (dotted lines) problem for
$j = 100, 200, \ldots, 600$ iterations at discretization level $n_t = 9$, poi = 25

Figure 5. Relative error history for $\delta u$ (left) and $\delta c_1$ (right) on the reduced (solid
lines) problem for $j = 4, 8, \ldots, 28$ iterations and on the full (dotted lines) problem for
$j = 100, 200, \ldots, 600$ iterations at discretization level $n_t = 9$, poi = 81

Figure 6. Relative error history for $\delta u$ (left) and $\delta c_1$ (right) on the reduced (solid
lines) problem for $j = 4, 8, \ldots, 28$ iterations and on the full (dotted lines) problem for
$j = 100, 200, \ldots, 600$ iterations at discretization level $n_t = 19$, poi = 81

It can clearly be seen that the iterative solver works very well on the
reduced system while it needs many iterations on the full matrix. This
was to be expected since it is a well-known fact (see, e.g., [1] and [2])
that iterative solvers working on the full Hessian require preconditioning.
Although the evaluation of $H_{\delta u}$ times a vector is computationally more
expensive than $H$ times a vector, the reduced Hessian algorithm is by far
the better choice over the unpreconditioned full algorithm. To give some
idea why the reduced Hessian algorithm outperforms the full Hessian
version, let us define

$$P = \begin{pmatrix} -e_u^* e_y^{-*} & I & e_u^* e_y^{-*} L_{yy} e_y^{-1} \\ 0 & 0 & I \\ I & 0 & 0 \end{pmatrix} \qquad (23)$$

as the left preconditioner for the full Hessian problem (11) with the first
two columns permuted (for simplicity, the separation condition (15) is
assumed to hold): From (11), we get

$$P \begin{pmatrix} 0 & L_{yy} & e_y^* \\ L_{uu} & 0 & e_u^* \\ e_u & e_y & 0 \end{pmatrix} \begin{pmatrix} \delta u \\ \delta y \\ \lambda^{k+1} \end{pmatrix} = -P \begin{pmatrix} f_y \\ f_u \\ e \end{pmatrix} \qquad (24)$$

which is equivalent to the block-triangular system

$$\begin{pmatrix} H_{\delta u} & 0 & 0 \\ e_u & e_y & 0 \\ 0 & L_{yy} & e_y^* \end{pmatrix} \begin{pmatrix} \delta u \\ \delta y \\ \lambda^{k+1} \end{pmatrix} = -P \begin{pmatrix} f_y \\ f_u \\ e \end{pmatrix} \qquad (25)$$

whose rows are just the equations (12)-(14). Hence the reduced Hessian
problem is nothing else than the full problem after preconditioning with
$P$. Comparing (11) to (25), it turns out that the preconditioning actually
provides the iterative solver with some insight into the interdependence
of the unknown variables. While in the full Hessian system, the solver
takes all variables as degrees of freedom, in the reduced system only
the true free variables (i.e. the controls) appear and the state and the
adjoint are calculated consistently. From this point of view, the reduced
Hessian method resembles what is usually called a direct single shooting
approach, applied to a linear-quadratic model.
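
The block-triangular structure obtained by preconditioning with $P$ can be checked in the simplest conceivable setting, where every block is a scalar. The numbers below are arbitrary illustrative values, not data from the tests, and the separation condition (15) is assumed, so that $L_{uy} = L_{yu} = 0$.

```python
def matmul(A, B):
    """Product of two small dense matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Hypothetical scalar stand-ins for the operator blocks.
L_yy, L_uu, e_y, e_u = 2.0, 3.0, 4.0, 5.0

# Full Hessian with the first two columns permuted, unknowns (du, dy, lambda).
full = [[0.0,  L_yy, e_y],
        [L_uu, 0.0,  e_u],
        [e_u,  e_y,  0.0]]

# Left preconditioner P in scalar form.
P = [[-e_u / e_y, 1.0, e_u * L_yy / e_y**2],
     [0.0,        0.0, 1.0],
     [1.0,        0.0, 0.0]]

# Reduced Hessian in scalar form (separable case).
H_reduced = L_uu + e_u * L_yy * e_u / e_y**2

PA = matmul(P, full)
for row in PA:
    print(row)
```

The first row of the product contains only the reduced Hessian, the second row is the linearized state equation and the third row the adjoint equation.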

The necessity to have the full and reduced Hessian matrix explicitly
available for the numerical tests limits the discretization levels to very
coarse ones throughout this paper. In practice, however, control problems
for time-dependent PDEs with about 275 000 unknowns (including
40 000 control variables) have been successfully solved on a desktop PC
within 2 hours using the reduced Hessian SQP algorithm.

Abstract The regularity of solutions of parabolic initial-boundary value problems
directly depends upon the regularity of the boundary data. Reduced
regularity of boundary data arises e.g. in optimal boundary control problems
governed by evolution equations when the control is discretized
by piecewise constant functions, and it results in refined grids if automatic
step size procedures in time are applied. In the present study the effects
on numerical methods for solving the state equations are illustrated.
Moreover, an appropriate splitting of the solution is used to improve
the numerical behavior of the discretization technique as well as of the
optimization method applied to the control problem itself.

Keywords: Boundary control, parabolic equation, discretization. 

1. Introduction 

Smoothness properties of solutions of parabolic initial-boundary value 
problems directly depend upon the smoothness of initial and boundary 
data. As a consequence, discretizing the boundary control by piecewise 
given functions generically results in a reduced smoothness of solutions of 
the related state equations. However, the efficiency of numerical meth- 
ods for partial differential equations depends on the regularity of the 
desired solution. This yields specific effects like severe local grid refine- 
ments in time when standard discretization techniques are applied. In 
the present paper we investigate such effects, and in the case of piecewise
constant Dirichlet controls we use a splitting of the solution to improve
the numerical behavior of the discretization technique as well as of the
optimization method applied to the control problem.

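Throughout the paper we consider spatially one-dimensional boundary
heat control problems

$$J(u) := \frac{1}{2}\int_0^1 [w(x,T;u) - q(x)]^2\,dx + \frac{\alpha}{2}\int_0^T [u(t) - p(t)]^2\,dt \to \min! \qquad (1.1)$$

subject to the state equations

$$\begin{aligned}
\frac{\partial w}{\partial t} - \sigma^2 \frac{\partial^2 w}{\partial x^2} &= f && \text{in } Q := (0,1)\times(0,T], \\
w &= 0 && \text{on } \Gamma_l := \{0\}\times(0,T], \\
\gamma_D w + \gamma_N \frac{\partial w}{\partial x} &= u && \text{on } \Gamma_r := \{1\}\times(0,T], \\
w &= g && \text{on } Q_0 := [0,1]\times\{0\},
\end{aligned} \qquad (1.2)$$

with control $u$ and the state $w(\cdot,\cdot;u)$ in the weak sense (cf. [11], [12]).
Here $\gamma_D, \gamma_N \ge 0$ are given coefficients satisfying $\gamma_D + \gamma_N > 0$ and $\alpha > 0$
is a fixed regularization parameter. Further, $U \neq \emptyset$, $U \subset L_\infty(0,T)$
denotes a set of admissible controls, $q \in L_2(0,1)$ is the given target
temperature and $p \in U$ denotes some fixed reference control.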

For controls u Eli here we restrict ourselves to discretizations E 
defined over a given time grid 

<t^ <■■■ < =T (1.3) 

by piecewise constant functions, i.e. 

u'^ eU^ ^ u^(t) = e R, Vi E t'^], k = (1.4) 

Here denotes the number of time intervals for control discretization. 
To distinguish between discretizations of control and states we indicate 
the first ones by upper scripts (as above) and the second ones by lower 
scripts. 

In case of Dirichlet controls, i.e. = 1, 7at = 0, jumps of at 
inner grid points k = 1, . . . , — 1, cause discontinuities of the re- 
lated solution In literature, there are several results on the 

numerical treatment of the heat equation with irregular solutions, where 
the irregularities result from the functions / or ^ (cf. [2], [ 6 ], [9], [10]). 

In the literature one can find three approaches to overcome such
difficulties. In the first one, fitted methods are constructed with coefficients
which are adapted to the singularities (cf. [6]). In the second approach
a standard method is chosen but with specifically refined meshes in the
neighborhood of singularities (cf. [2], [9], [10]). The third approach
splits off the singularities. In the present paper we apply this splitting.
Related details are discussed in the following section.



2. Numerical treatment of the state equations 
2.1 Splitting in case of Dirichlet conditions 

In the present subsection we consider the case 7 d = 1, 7 at = 0. At 
first, we assume compatibility at x == 0, i.e. g'(O) = 0, and (for the sake 
of a uniform description) we extend the piecewise constant function 
to t = 0 by i^^(O) vP g{l). 

Let us now introduce functions \ Q — > R, /c = 1, . . . , by 



w^{x^ t) 



^0 , if rc € [0, 1], 4 



(2.1)

with the error function

$$\operatorname{erf}(\xi) := \frac{2}{\sqrt{\pi}} \int_0^\xi e^{-s^2}\,ds, \qquad \xi \in \mathbb{R}. \qquad (2.2)$$



Definition (2.1), (2.2) yields G C'°°(Q\{(l,t^ ’^)}) and 



dw'^ 2 
dt ^ dx^ 



0 in Q. 



Further, has a jump w.r.t. t at (l,t*^“^). Hence, occurring disconti- 
nuities of the solution of (1.2) at the points (l,t^). A: = 0, . . . ,M*^ — 1, 
originated by jumps in u, can be captured by the functions w^. Namely, 
using superposition, the solution w{-, ■ ; u'^) of (1.2) can be written as 



w{x,t;u^) = w{x,t;u^) +v{x,t‘u'^), (x,t) E Q (2.3) 

for any given G C/'^, where w{-,- ■,u'^) is defined by 



M<= 

w{x,t;u'^) := w^{x,t), (x,t) £ Q 



(2.4) 



k=l 



and v{-,- ;u) denotes the solution of the related parabolic problem 

dv _2 d‘^v _ f D 

V = —w on F/, u = 0 on F^, (^-^) 

V = g on Qo- 







Due to == ^(0) = 0 for any G U'^ ^ the smoothness of w at 

X = 0 and sufficiently smooth functions / and the discontinuities of 
w are completely captured by w. Hence, problem (2.5) allows a better 
numerical treatment than the original PDE. 

2.2 Discretization of the state equations 

In the preceding section we described the principal impact of piecewise
discretizations of controls on the smoothness of the solutions of the
state equations. Now, we sketch consequences of reduced regularity for
numerical methods applied to (1.2) with discretized boundary data.

Among the variety of methods let us consider semi-discretization by
the standard method of lines (MOL) as well as full discretization schemes.
The major difference between both approaches is that in the first one standard
ODE solvers with efficient step size control can be applied, while the full
scheme provides direct access to the time grid, which will later be
advantageous in evaluating adjoint states for the optimal control problem.
Consider some spatial grid $\{x_i\}_{i=0}^N$ on the interval $[0, 1]$, i.e.

$$0 = x_0 < x_1 < \cdots < x_{N-1} < x_N = 1. \qquad (2.6)$$

Using simple finite differences we obtain a spatial semi-discretization of 
the PDE by 

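$$\frac{dw_i}{dt}(t) = \frac{\sigma^2}{h_{i+1/2}} \left[ \frac{w_{i+1}(t) - w_i(t)}{h_{i+1}} - \frac{w_i(t) - w_{i-1}(t)}{h_i} \right] + f(x_i, t), \quad i = 1, \ldots, N-1 \qquad (2.7)$$

with $h_i := x_i - x_{i-1}$, $i = 1, \ldots, N$ and $h_{i+1/2} := (h_i + h_{i+1})/2$. Here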
and in the sequel Wi denote functions which approximate w{xi^-]u'^). In 
addition to (2.7) the boundary conditions from (1.2) at a; = 1 are taken 
into account by 

lD^N{t) = u''{t), te (0,T] (2.8a) 

and 




hN dwN u\ _ 



KW -jDWN{t)] 



2 WN{t) - WN-l{t) , h 



+ -^/(x7v,t), t G (0,T] 



for = 0 and jn 7^ 0, respectively, while at a; = 0 we have in 
both cases ^^o(^) — 0. If we consider splitting then instead of (1.2) 
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we apply semi-discretization to problem (2.5) and we have vc)(t) = 
-w{0,t;u'^), VN{t) =0. 

Together with the initial conditions

$$w_i(0) = g(x_i) \qquad (2.9)$$

we obtain an IVP system for the functions $w_i$. Notice that in case of
Dirichlet control the number of unknowns is $N - 1$, otherwise $N$. We
will not explicitly distinguish these cases and write for simplicity in the
sequel just $N$.

In our first approach we treat the IVP (2.7)-(2.9) by standard ODE
codes for stiff IVPs. In particular, in our study we applied BDF codes
and the trapezoidal rule with automatic step size control.

As an alternative to semi-discretization with standard ODE codes, to which
we refer in the sequel simply as semi-discretization, in a second approach
we apply the implicit Euler method with a fixed time step $T/M$ to (2.7)-(2.9),
which we denote in the sequel as full discretization.

In both approaches discrete states are denoted by $w_{ij}$, $i = 0, 1, \ldots, N$,
$j = 0, 1, \ldots, M$, where $M$ is the number of time steps.
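
The full discretization can be sketched in a few lines. The following Python fragment is only an illustration under simplifying assumptions that go beyond the text: an equidistant spatial grid, $f = 0$, Dirichlet control ($\gamma_D = 1$, $\gamma_N = 0$), and a tridiagonal solver written out by hand. The names `thomas` and `implicit_euler_heat` are hypothetical and do not refer to the codes used in the experiments.

```python
import math

def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system; lower[0] and upper[-1] are not used."""
    n = len(rhs)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def implicit_euler_heat(g, u, sigma, T, N, M):
    """Approximate w(x_i, T), i = 0..N, for w_t - sigma^2 w_xx = 0 with
    w(0,t) = 0, w(1,t) = u(t), w(x,0) = g(x), using a fixed time step T/M."""
    h = 1.0 / N
    tau = T / M
    r = sigma**2 * tau / h**2
    w = [g(i * h) for i in range(N + 1)]
    n = N - 1                       # interior unknowns in the Dirichlet case
    lower = [-r] * n
    diag = [1.0 + 2.0 * r] * n
    upper = [-r] * n
    for j in range(1, M + 1):
        rhs = w[1:N]                # copy of the interior values
        rhs[-1] += r * u(j * tau)   # boundary value at x = 1, new time level
        interior = thomas(lower, diag, upper, rhs)
        w = [0.0] + interior + [u(j * tau)]
    return w

# Smooth test case with known solution sin(pi x) exp(-sigma^2 pi^2 t), u = 0.
w_T = implicit_euler_heat(lambda x: math.sin(math.pi * x), lambda t: 0.0,
                          sigma=1.0, T=0.1, N=50, M=200)
exact_mid = math.exp(-math.pi**2 * 0.1)
print(w_T[25], exact_mid)
```

Because the time grid is fixed and explicit here, every time level is directly accessible, which is the property exploited later for the discrete adjoint.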

3. Numerical treatment of the control problem 

3.1 Gradient evaluation 

Discretization of the controls and the state equations leads to an ap- 
proximation of the original optimal control problem (1.1), (1.2) by a 
finite dimensional quadratic programming problem. Let us consider the 
case that no constraints are imposed upon the controls. 

The state equations result in an affine mapping transferring discrete 
controls G into discrete terminal states w.^m^ he. we have 

W.^M — “b CLh,r (^*1) 

with some matrix G , E^) and some vector ^ With 

discrete scalar products (•,•) in and respectively, we obtain 

problems of the type 

-> min ! s.t. G U'^ (3.2) 

with 

Jh,r{u'^) ■= 2i^h,Tu'"-qh,r^^h,TU'-qh,T) + (3-3) 

Here ~ qu ~ CLh,r-) and p'^ G U'^ denotes some approximation of p. 
Further, the necessary optimality conditions are given by 

- 9/1, r) + OC {u^ ~ p'^) = 0. 
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It should be noticed that in case of full discretization 
known, but will not be constructed explicitly because of the dynamic 
nature of the discrete state equations. However, in case of semi-discreti- 
zation where some ODE software code is applied to (2.7)-(2.9) then 
^/i,T depend on various additional features, like built-in automatic 
step size controls. In case of semi-discretization as well as full dis- 
cretization is applied the image can be determined for any E 

by discrete time integration. Moreover, adjoint equations provide 
an efficient tool for gradient evaluations replacing the calculation of 
— Qhr)' For the optimal control problem (1.1), (1.2) the 
corresponding adjoint problem is defined by (cf. [1], [4], [7], [11]) 

dz , _2 _ n • 

z = 0 on r^, 7dz + jn^ - 0 on 

z = w — q on 

and the reduced gradient of the objective at G [7 in direction s G 
Loo(0, T) is given by 

(7^ dz 

J'[u) s = ^ ( 2 ( 1 , • ; n) - —(1, ■■,u) + a{u- p),s), . (3.5) 

7 d + JN dx f 

Notice that after reversing the time orientation the adjoint problem (3.4) 
is of parabolic type as the state equation (1.2). However, unlike in the 
state equation in the adjoint equation we meet incompatibility only at 
one time level, namely t = T. 

For the remaining part of this section we restrict ourselves to the
case $\gamma_D = 1$, $\gamma_N = 0$. Further, for simplicity in the sequel we consider
equidistant spatial grids and denote their step size by $h > 0$.

When applying standard ODE solvers to the related semi-discrete 
IVP (2.7)-(2.9) and an appropriate discretization to the scalar product 
in (3.5) we obtain the following approximation of the discrete directional 
derivative 



Q, 

[o,iixm 



(3.4) 









s^, (3.6) 



keKi 



where G M, j = 1,...,M^ are the coefficients of G 

t/^, (zij) is the discrete solution of the adjoint problem (3.4), {'&k}^o 
denotes the time grid generated by the applied ODE solver and 

Kj :={k€{l,...,M}: {p-\ t^]}, := j = 1, . . . , 
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To obtain (3.6) from (3.5) besides simple integration, the derivative of 
$z$ at $x = 1$ is approximated by one-sided finite differences, where we take
into account the boundary condition $z(1, \cdot) = 0$. Equation (3.6) provides
the representation of the discrete gradient via the adjoints.

In case of full discretization, the discrete gradient can be evaluated
directly via the corresponding discrete adjoint system. Similarly to the
continuous adjoint system, after time reversal it turns out to be an
implicit Euler scheme again. The obtained formula for the discrete gradient
(3.6) can also be interpreted as an approximation of the continuous one.

Our numerical experiments confirmed the fact that the discrete adjoints
of the full discretization lead to exact gradients as generated by
automatic differentiation tools (see [3]). However, if software tools are
applied to the semi-discretization of (2.5) and of the adjoint equations (3.4),
then only an approximation of the gradients is obtained. One reason for
that deviation is that applications of ODE solvers with time step control
lead to discretizations of the states and adjoint states with different
time grids. Thus the discretization of the adjoint states is not adjoint
to the discrete states in the sense of the discrete $L_2$-norm but only an
approximation. Moreover, the summation in the formula for the discrete
gradient (3.6) causes a further amplification of the error. Hence, to
guarantee convergence of optimization techniques based on this approach, a
sufficiently high order of accuracy in the applications of ODE software
is required, which becomes rather expensive for fine discretizations.

3.2 Selected minimization techniques 

Since the gradient can be obtained quite easily via adjoint states con- 
jugate gradient methods as well as quasi-Newton techniques (e.g. Broy- 
den’s symmetric update, DFP-method, . . . ) are appropriate for solving 
the discrete quadratic minimization problem (3.2). 

To make the paper self contained we describe briefly the major steps of 
methods used in our tests for solving (3.2). Let us denote the elements of 
a sequence {u^} of discrete controls by := £U'^ . In the considered 

piecewise constant approximation we can represent by its coefficients 
GM, A; = 0,1,...,M^ / = 0,1,... . 

As one of the methods of choice we applied conjugate gradient meth- 
ods. Starting with some G U'^ and /3q := 0, these methods generate 
a minimizing sequence {u^} C U'^ recursively by 



$$s^l := -J'(u^l) + \beta_l s^{l-1}, \qquad \beta_{l+1} := \frac{\|J'(u^{l+1})\|^2}{\|J'(u^l)\|^2}, \qquad u^{l+1} := u^l + \alpha_l s^l, \text{ with Cauchy step size } \alpha_l > 0. \qquad (3.7)$$

CG methods terminate with the optimal control given by the final
minimizer in a finite number of steps provided exact function and gradient
evaluations are applied and no rounding errors occur. This is, however,
unrealistic in the problems under consideration, but the convergence can
be accelerated by appropriate preconditioning (cf. [5], [8]). We found
that in the case of Dirichlet control the analytic solution (2.4), which
captures jumps in boundary data, can be used for preconditioning.

In case of unconstrained controls the Cauchy (i.e. minimizing) step 
size a is easily obtained. However, penalty methods for the treatment 
of constraints require additional step size procedures. 
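
For an unconstrained quadratic objective the conjugate gradient recursion with the Cauchy step size can be written out in a few lines. The sketch below assumes an objective of the form $\frac12\|Gu - q\|^2 + \frac{\alpha}{2}\|u - p\|^2$ with a small, explicitly given matrix $G$; the data and the function name `cg_quadratic` are made up for illustration, whereas in the control problem $G$ is never formed and its action is obtained by time integration.

```python
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cg_quadratic(G, q, p, alpha, u0, steps):
    """CG with exact (Cauchy) step size for
    J(u) = 0.5*|G u - q|^2 + 0.5*alpha*|u - p|^2 (hypothetical test objective)."""
    Gt = transpose(G)

    def grad(u):
        residual = [a - b for a, b in zip(matvec(G, u), q)]
        return [a + alpha * (ui - pi)
                for a, ui, pi in zip(matvec(Gt, residual), u, p)]

    def hess_times(s):
        return [a + alpha * si for a, si in zip(matvec(Gt, matvec(G, s)), s)]

    u = u0[:]
    g = grad(u)
    s = [-gi for gi in g]                 # beta_0 = 0
    for _ in range(steps):
        gg = dot(g, g)
        if gg == 0.0:
            break
        step = -dot(g, s) / dot(s, hess_times(s))   # minimizer along s
        u = [ui + step * si for ui, si in zip(u, s)]
        g_new = grad(u)
        beta = dot(g_new, g_new) / gg
        s = [-gn + beta * si for gn, si in zip(g_new, s)]
        g = g_new
    return u, g

G = [[1.0, 2.0, 0.0], [0.0, 1.0, 1.0], [1.0, 0.0, 1.0]]
u_opt, g_final = cg_quadratic(G, q=[1.0, 0.0, 2.0], p=[0.0, 0.0, 0.0],
                              alpha=0.1, u0=[0.0, 0.0, 0.0], steps=3)
print(u_opt, max(abs(v) for v in g_final))
```

With three unknowns the gradient vanishes, up to rounding, after three steps, which reflects the finite termination property mentioned above.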

As other methods of choice we included quasi-Newton methods in
our study. Their basic idea is to define the search direction $s^l$ by

$$H_l s^l = -J'_l, \qquad (3.8)$$

where $J'_l := J'(u^l)$ and $H_l$, $l = 0, 1, \ldots$, denote matrices satisfying the
related quasi-Newton equation

$$H_l (u^l - u^{l-1}) = J'_l - J'_{l-1}, \quad l = 1, 2, \ldots. \qquad (3.9)$$

Starting with the identity $H_0 = I$ the matrices $H_l$ are updated by
appropriate formulas. In particular, we considered Broyden's symmetric
update. Let

r'+i := J'l+i-J'i - 

Then the new matrix is defined by 

«+■ = ■»' + (3-10) 

In the evaluation $u^{l+1} := u^l + \alpha_l s^l$ the step size $\alpha_l > 0$ has been selected
according to a simplified Armijo rule. For a detailed description of CG
methods and quasi-Newton methods we refer e.g. to [5], [8].

Occurring constraints 



< 1, i = l,...,M^ 

on controls have been included by the penalty term {p > 0) 

■= y(iii + 1)2 +p + y'(uJ - 1)2 + P - 2 . (3.11) 

j=i L 

For p -> 0+ this tends uniformly to the well-known non-smooth penalty 
Po('^^) := c |^max{0, —u^ — 1} -f max{0,tA'^ — 1} , 
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which is exact for sufficiently large constant c > 0. For p > 0 the 
penalty Pp is infinitely often differentiable. This forms an advantage in 
comparison with loss functions. Further, unlike for barriers the values 
Pp{u^) are finite for any discrete control . For the first derivative and 
the Hessian we have 



+ I 

+ iy + p 



— 1 

— ly + p 



and 




— diae- i ^ ^ ^ 

2 ^ I L((^^‘ + 1)" + - I? + pf/^ 




respectively. These derivatives have been used directly in the quasi- 
Newton methods, i.e. only components related to J(-) are taken into con- 
sideration by the quasi-Newton update. On the other hand, in Armijo’s 
step size rule only penalty terms have to be repeatedly evaluated due to 
the quadratic nature of J(-). This accelerates the code compared to an 
application of an all-purpose minimization routine. 
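
The behavior of the smoothed penalty can be checked numerically. The sketch below assumes that (3.11) has the form $P_\rho(u) = \frac{c}{2}\sum_j \bigl[\sqrt{(u_j+1)^2+\rho} + \sqrt{(u_j-1)^2+\rho} - 2\bigr]$; the prefactor $c/2$ is an assumption chosen so that $P_\rho$ tends to the non-smooth penalty $P_0$ as $\rho \to 0+$. The function names are hypothetical.

```python
import math

def smooth_penalty(u, rho, c=1.0):
    """Smoothed penalty for the bounds -1 <= u_j <= 1 (assumed prefactor c/2)."""
    return 0.5 * c * sum(math.sqrt((uj + 1.0)**2 + rho)
                         + math.sqrt((uj - 1.0)**2 + rho) - 2.0 for uj in u)

def smooth_penalty_grad(u, rho, c=1.0):
    """Gradient of the smoothed penalty; each component depends on u_j only."""
    return [0.5 * c * ((uj + 1.0) / math.sqrt((uj + 1.0)**2 + rho)
                       + (uj - 1.0) / math.sqrt((uj - 1.0)**2 + rho))
            for uj in u]

def exact_penalty(u, c=1.0):
    """Non-smooth penalty c * sum(max(0, -u_j - 1) + max(0, u_j - 1))."""
    return c * sum(max(0.0, -uj - 1.0) + max(0.0, uj - 1.0) for uj in u)

u = [-2.0, 0.0, 1.5]
for rho in (1e-2, 1e-4, 1e-8):
    print(rho, smooth_penalty(u, rho), exact_penalty(u))
```

Since $0 \le \sqrt{x^2 + \rho} - |x| \le \sqrt{\rho}$, the difference between the two penalties is bounded by $c\sqrt{\rho}$ per component, independently of $u$, which is the uniform convergence stated above.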



4. Numerical experiments
4.1 Preliminaries 

In our numerical experiments we tested the performance of different
techniques applied to IBVPs (1.2) with discontinuous boundary data and
studied effects in connection with boundary control problems of
tracking type. All experiments are implemented in MATLAB. The focus
in Examples 1, 2 was directed towards the behavior of automatic step
size procedures in ODE codes and towards improving the efficiency
of such codes by using the splitting described in subsection 2.1. In
connection with optimal control, in Examples 3, 4 we studied the influence
of discontinuities in boundary data on the convergence of minimization
techniques.

In all examples we choose equidistant grids xi — i/N and = 
{Tk)/M^. Further, in the first two examples we choose a = 1/2, T = 1, 
but in the last two cr = 1, T = 0.1. 

The following tables and figures report on numerical results obtained
by the BDF code ode15s (option BDF=on) using several maximal orders
of consistency (option MaxOrder) and the trapezoidal rule ode23t,
respectively. If not written otherwise, the default values of the relative
and absolute error tolerance, RelTol=1e-3 and AbsTol=1e-6, respectively,
are used. Further, in the Dirichlet case ($\gamma_D = 1$, $\gamma_N = 0$) we split
the experiments into directly solving problem (1.2) by the method of lines
(named 'direct' in the following tables) and applying superposition
(2.3) to treat occurring jumps in boundary data. In the latter case we
solve numerically the remaining smooth problem (2.5). All described
effects depend on and the height of the jumps. 

4.2 Example 1 (state equations) 

For the first example we choose / = 0, ; ^ = 0, = 3, N = bO 

with boundary data in (1.4) according to [u^] = (1,— 2,3)^. Fig. 
1 shows the obtained solution w{x^t]u^) for Dirichlet and Neumann 
boundary conditions, respectively. The number of required time steps is 




Figure 1. 7 ^) = 1, = 0 and 7 d = 0, 7 at = 1 

reported in Tab. 1. For the trapezoidal rule code the related results are 
marked with T instead of the order as done for BDF code. 



treatment              direct                      superposition
maximal order          1      2      5      T      1      2      5      T
obtained time steps    2331   518    322    377    364    111    58     81

Table 1. Comparison of different approaches



The left two graphs in Fig. 2 illustrate the behavior of the automatic 
step size control when applied directly or after splitting in case of Dirich- 
let boundary conditions. Further, in the right graph step size results in 
case of Neumann boundary conditions are reported. 

The numerical experiments show (see Fig. 2) that each jump in the 
control reduces the time step size drastically. On the other hand, 
splitting-off the discontinuities (in case of Dirichlet-boundary conditions) 
in advance avoids these time step size reductions and, hence, yields a 
more effective numerical procedure. 
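
The gain from splitting off the discontinuities can be read directly from Tab. 1. The following short sketch only redoes that arithmetic: it divides the step counts of the direct treatment by those of the superposition approach for each solver setting. The variable and function names are illustrative and not part of the original experiments.

```python
# Time-step counts copied from Table 1 (BDF maximal order 1, 2, 5 and
# trapezoidal rule "T"); nothing here is recomputed by an ODE solver.
direct = {"1": 2331, "2": 518, "5": 322, "T": 377}
superposition = {"1": 364, "2": 111, "5": 58, "T": 81}


def step_ratio(setting):
    """Return how many times more steps the direct treatment needed."""
    return direct[setting] / superposition[setting]


if __name__ == "__main__":
    for setting in direct:
        print(f"setting {setting}: direct/superposition = {step_ratio(setting):.1f}")
```

For the tabulated data every ratio lies between roughly 4.6 and 6.5, i.e. superposition needed about one fifth of the time steps in this example.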

4.3 Example 2 (state equation, known exact solution)

In this example we consider a problem with Dirichlet boundary con- 
ditions where the exact solution is known. The required discontinuous 
boundary data are generated by means of the function w introduced in 
Section 2. Unlike in the previous tests here we concentrate on the error 
behavior. Let the exact solution be given by 



w{x,t;u'^) = g{x) - 



w{x^ t\ v7) — (1 — x) tJ)(0, t\ u^) 



with g{x) == lQx‘^{l — x)^^. To study one internal jump only we choose 
= 2 and in (1.4) according to {u^} = (1, -1)^T.
Fig. 3 shows the obtained solution w(x, t; u^) and, together with Fig. 4,
the error of the BDF code with MaxOrder=5 for superposition and the
direct approach, respectively. In the right picture of Fig. 4 the
neighborhood of the point (x, t) = (1, 0), where a jump is located, is cut off.




Figure 3. Solution w(x, t; u^) and error for superposition

Figure 4. Error in case of the direct solution



Choosing different numbers N of spatial grid points with fixed accuracy
RelTol=AbsTol=1e-8 we obtain

N                      1600        800         200        50
obtained time steps    1274/133    1138/135    894/140    666/150
error at t = 1         9e-07       3e-06       6e-05      9e-04

Table 2. Comparison of required time steps for different N



where in the second row of Tab. 2 the first number is related to direct 
treatment, the second to superposition. 

The numerical experiments show (see Tab. 2 and Figs. 3, 4) that the
step size reduction becomes more severe as N grows.

Finally, we notice that the numerical solution converges at t = T, 
although there is no convergence locally near jumps. 
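
The errors at t = 1 listed in Tab. 2 also allow a rough estimate of the observed convergence order with respect to the spatial grid. The sketch below assumes an error model of the form error ≈ C · N^(-p), which is an assumption made here for illustration and is not stated in the text; since the tabulated errors are rounded to one digit, the resulting orders are only indicative.

```python
import math

# (N, error at t = 1) pairs copied from Table 2, ordered from coarse to fine.
errors = [(50, 9e-04), (200, 6e-05), (800, 3e-06), (1600, 9e-07)]


def observed_order(coarse, fine):
    """Estimate p in error ~ C * N**(-p) from two (N, error) pairs."""
    (n_coarse, e_coarse), (n_fine, e_fine) = coarse, fine
    return math.log(e_coarse / e_fine) / math.log(n_fine / n_coarse)


# One estimate per pair of consecutive table columns.
orders = [observed_order(a, b) for a, b in zip(errors, errors[1:])]

if __name__ == "__main__":
    for (n0, _), (n1, _), p in zip(errors, errors[1:], orders):
        print(f"N = {n0} -> {n1}: observed order about {p:.2f}")
```

All three estimates come out close to two, which is consistent with a second-order spatial discretization, although the rounding of the table entries does not permit a sharper statement.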

4.4 Example 3 (unconstrained control problem) 

We consider the optimal control problem (1.1), (1.2) with 

p = 0, f = 0, 5 = 0 and q(x) = 0.05 sin(4πx).

The convergence behavior of a CG-algorithm as well as a quasi-Newton 
method with Broyden’s update is compared for both the approaches dis- 
cussed in Subsection 3.1, i.e. that the calculation of the discrete gradient 
(3.5) is based on semi-discretization with discretized continuous adjoints 
and full discretization with discrete adjoints, respectively. In case of 
semi-discretization superposition is used for the solution of state as well 
as for the adjoint state equations. The remaining regular problems were 
treated by the BDF-code of MATLAB with MaxOrder=5. In Fig. 5 we
included results from semi-discretization, RelTol=AbsTol=1e-5, semi-
discretization, RelTol=AbsTol=1e-12 and full discretization, M = 500.
The related curves are marked by △, ▽ and ○, respectively.

Figure 5. CG-algorithm and Broyden's method

Further, in Fig. 6 the corresponding optimal controls obtained by the
CG-algorithm are reported. We note that further slow improvements
were obtained beyond the iteration steps plotted in Fig. 6.




Figure 6. Optimal control obtained by CG-algorithm: a) semi-discretization, 1e-5; b) semi-discretization, 1e-12; c) full discretization



In Tab. 3 the influence of the control grid is given for full discretization.
Semi-discretizations with sufficiently high accuracy in the ODE
solvers show a similar behavior. In general we remark that, in addition
to slower convergence, semi-discretization in both cases of accuracy is
more expensive, i.e. consumes significantly more computer time, than
the full discretization.

control grid    CG method    Broyden's update
10              4.38e-03     4.84e-03
25              2.59e-03     2.78e-03
50              1.42e-03     1.87e-03
100             1.40e-03     1.73e-03

Table 3. Comparison of convergence behavior for different control grids

4.5 Example 4 (constrained control problem) 

We choose f = 0, S' = 0. Further, we start with a control problem
(1.1), (1.2) which possesses the optimal solution

u_ref(t) = 1.5 sin ^1^) , t ∈ [0, T]

if no constraints are given for the controls. Using this, the functions q
and p are defined by q(x) := w(x, T; u_ref) and p := u_ref, respectively,
with the solution w(·, ·; u_ref) of the state equation (1.2) for u = u_ref.





control grid    semi-discretization    full discretization    clipping
10              1.41e-03               1.36e-03               6.93e-03
25              8.83e-04               2.71e-04               2.30e-03
50              3.23e-04               1.06e-05               1.10e-03
100             7.28e-04               9.84e-06               6.60e-04

Table 4. Obtained objective values for different control grids



In Tab. 4 the achieved optimal values are reported for the two ap- 
proaches. In addition, we show in the last column the objective value 
for the discrete control which is obtained from the unconstrained optimal 
one by simple clipping along the constraints. 

The following Fig. 7 shows discrete optimal controls obtained by Broy- 
den’s update (3.10) to the quadratic part (from the state equations) and 
by direct use of up to second order derivatives of the penalties as given 
in Section 3. Further, in Fig. 8 the approximation of the tracked tar- 
get and a comparison between the constrained and the unconstrained 
optimal controls are given. 




Figure 8. Approximation of the target; constrained and unconstrained control

The computational experiments showed a behavior very similar to that in
the unconstrained case. In semi-discretization the state as well as the
adjoint system have to be solved with a sufficiently high accuracy to
ensure a good approximation of the gradient. This, however, results in 
high time consumption in the applied ODE solver. On the other hand, 
full discretization in more general cases (in particular in higher spatial 
dimensions) requires additional preparatory work compared with the use 
of available software codes. 

5. Conclusions 

Piecewise constant discretization of boundary controls yields a re- 
duced smoothness of the solutions of state equations. In all our con- 
sidered examples this resulted in locally small step sizes if ODE solvers 
were applied to a semi-discretization of the state equations. These prob- 
lems could be avoided by considering in advance a specific splitting of 
the state equations. 

In the examples of optimal control problems semi-discretization was 
only used in connection with a separation of the discontinuities. Hence, 
the ODE solvers were, in fact, applied to the regular subproblem.
Nevertheless, this approach turned out to be more time consuming than
full discretization combined with discrete adjoints. In addition, full dis- 
cretization often yielded better values of the objectives and proved to be 
faster for comparable accuracy. Further, if lower accuracies were applied 
to speed up the ODE codes in semi-discretization then the optimization 
became slow due to the fact that discretizations of continuous adjoint 
problems lead to only rough approximations of gradients. 

Abstract A family of parameter dependent elliptic optimal control problems with
nonlinear boundary control is considered. The control function is subject
to amplitude constraints. It is shown that under standard coercivity
conditions the solutions to the problems are Bouligand differentiable (in
$L^s$, $s < \infty$) functions of the parameter. The differentials are
characterized as the solutions of accessory linear-quadratic problems.

Keywords: Parametric optimal control, elliptic equation, nonlinear boundary con- 
trol, control constraints, Bouligand differentiability of the solutions 

1. Introduction 

In this paper, we analyse differentiability, with respect to the parame- 
ter, of solutions to a nonlinear boundary optimal control problem for an 
elliptic equation. Our aim is to show that, under a standard coercivity 
condition, the solutions to the optimal control problem are Bouligand 
differentiable functions of the parameter. Let us recall this concept of 
differentiability (see [3, 8, 9]). 

Definition 1 A function $\phi$ from an open set $G$ of a normed linear space
$H$ into another normed linear space $X$ is called Bouligand differentiable
(or B-differentiable) at a point $h_0 \in G$ if there exists a positively
homogeneous mapping $D_h\phi(h_0) : G \to X$, called B-derivative, such that

$$\phi(h_0 + \Delta h) = \phi(h_0) + D_h\phi(h_0)\Delta h + o(\|\Delta h\|_H). \tag{1}$$

Clearly, if $D_h\phi(h_0)$ is linear, it becomes the Fréchet derivative.
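
A scalar illustration of Definition 1, not taken from the paper: the projection of a real number onto an interval $[m_1, m_2]$ is B-differentiable at the lower bound $u_0 = m_1$, with the positively homogeneous but non-linear derivative $d \mapsto \max(d, 0)$. The sketch below checks these two properties numerically; the function names and the sample values are hypothetical.

```python
def project(u, m1, m2):
    """Pointwise projection of the real number u onto [m1, m2]."""
    return min(max(u, m1), m2)


def b_derivative_at_lower_bound(d):
    """Candidate B-derivative of project at u0 = m1, applied to direction d."""
    return max(d, 0.0)


def remainder(d, m1=0.0, m2=1.0):
    """project(m1 + d) - project(m1) - D(d); it should be o(|d|) as d -> 0."""
    return project(m1 + d, m1, m2) - project(m1, m1, m2) - b_derivative_at_lower_bound(d)


if __name__ == "__main__":
    for d in (-0.3, 0.2):
        for t in (1e-1, 1e-3):
            print(f"d = {t * d:+.4f}: remainder = {remainder(t * d):.1e}")
    # Positive homogeneity holds, additivity (and hence linearity) fails:
    print(b_derivative_at_lower_bound(1.0) + b_derivative_at_lower_bound(-1.0))
```

In this toy case the remainder vanishes identically for small steps, while $D(1) + D(-1) = 1 \neq D(0) = 0$ shows that the derivative is not linear, so the projection is B-differentiable but not Fréchet differentiable at the bound.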

As in [4] and in [5], the sensitivity, i.e., differentiability analysis for the
original nonlinear problem is reduced to the same analysis for the acces- 
sory linear-quadratic problem. The starting point of the analysis is the 
Lipschitz stability result for the solutions to the linear-quadratic ellip- 
tic problems due to A. Unger [11]. Using this result, B-differentiability is 
proved in two steps. First, passing to the limit in the difference quotient, 
we show the directional differentiability and characterize the directional 
differential as the solution of an auxiliary linear-quadratic optimal con- 
trol problem. Using this characterization, in the second step we show 
that an estimate of the form (1) holds, i.e., the solutions are Bouligand 
differentiable. This result can be considered as a generalization of that 
obtained in [1], where a different methodology was used to prove the
directional differentiability of the solutions to parametric elliptic problems,
under the assumption that the cost functional is quadratic with respect 
to the control. 

2. Preliminaries 

Let $\Omega \subset \mathbb{R}^n$ denote a bounded domain with boundary $\Gamma$.
As usual, by $\Delta y$ and $\partial_\nu y$ we denote the Laplace operator and the
co-normal derivative of $y$ at $\Gamma$, respectively. Moreover, let $H$ be a Banach
space of parameters and $G \subset H$ an open and bounded set of feasible parameters.
For any $h \in G$ consider the following elliptic optimal control problem:

$(O_h)$ Find $(y_h, u_h) \in (W^{1,2}(\Omega) \cap C(\bar\Omega)) \times L^\infty(\Gamma)$ such that

$$F(y_h, u_h, h) = \min\Big\{ F(y,u,h) := \int_\Omega \varphi(y(x),h)\,dx + \int_\Gamma \psi(y(x),u(x),h)\,dS_x \Big\} \tag{2}$$

subject to

$$-\Delta y(x) + y(x) = 0 \ \text{in } \Omega, \qquad \partial_\nu y(x) = b(y(x),u(x),h) \ \text{on } \Gamma, \tag{3}$$

$$u \in U := \{ v \in L^\infty(\Gamma) \mid m_1 \le v(x) \le m_2 \ \text{a.e. in } \Gamma \}. \tag{4}$$

In this setting, $m_1 < m_2$ are fixed real numbers, $dS_x$ denotes the surface
measure induced on $\Gamma$, and the subscript $x$ indicates that the integration
is performed with respect to $x$. We assume:

(A1) The domain $\Omega$ has $C^{1,1}$-boundary $\Gamma$.

(A2) For any $h \in G$ the functions $\varphi(\cdot,h) : \mathbb{R} \to \mathbb{R}$,
$\psi(\cdot,\cdot,h) : \mathbb{R}\times\mathbb{R} \to \mathbb{R}$ and
$b(\cdot,\cdot,h) : \mathbb{R}\times\mathbb{R} \to \mathbb{R}$ are of class $C^2$.
Moreover, for any fixed $u \in \mathbb{R}$ and $h \in G$,
$b(\cdot,u,h) : \mathbb{R} \to \mathbb{R}$ is monotonically decreasing.
There is a bound $c_G > 0$ such that

$$|b(0,0,h)| + |D_{(y,u)}b(0,0,h)| + |D^2_{(y,u)}b(0,0,h)| \le c_G \quad \forall h \in G.$$

Moreover, for any $K > 0$ there exists a constant $l(K)$ such that

$$|D^2_{(y,u)}b(y_1,u_1,h) - D^2_{(y,u)}b(y_2,u_2,h)| \le l(K)\,(|y_1 - y_2| + |u_1 - u_2|)$$

for all $y_i, u_i$ such that $|y_i| \le K$, $|u_i| \le K$, and all $h \in G$. The same
conditions as above are also satisfied by $\varphi$ and $\psi$.

(A3) The functions $b(y,u,\cdot)$, $D_y b(y,u,\cdot)$ and $D_u b(y,u,\cdot)$ are Fréchet
differentiable in $h$. The functions $\varphi$ and $\psi$ possess similar properties.

By the following lemma, proved in [6], problem $(O_h)$ is well posed.

Lemma 1 If (A1)-(A3) hold, then for any $u \in U$ and any $h \in G$
there exists a unique weak solution $y(u,h) \in W^{1,2}(\Omega) \cap C(\bar\Omega)$ of (3).
Moreover, there exists $c > 0$ such that

$$\|y(u',h') - y(u'',h'')\|_{C(\bar\Omega)} \le c\,\big(\|u' - u''\|_{L^\infty(\Gamma)} + \|h' - h''\|_H\big). \tag{5}$$



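Define the following Hamiltonian and Lagrangian

$$\mathcal{H} : \mathbb{R}^3 \times G \to \mathbb{R}, \qquad \mathcal{L} : W^{1,2}(\Omega) \times L^\infty(\Gamma) \times W^{1,2}(\Omega) \times G \to \mathbb{R},$$

$$\mathcal{H}(y,u,p,h) := \psi(y,u,h) + p\,b(y,u,h), \tag{6}$$

$$\mathcal{L}(y,u,p,h) := F(y,u,h) - \int_\Omega p\,(-\Delta y + y)\,dx
= \int_\Omega [\varphi(y,h) - (\nabla p, \nabla y) - (p,y)]\,dx + \int_\Gamma \mathcal{H}(y,u,p,h)\,dS_x. \tag{7}$$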

We assume:

(A4) For a given reference value $h_0 \in G$ of the parameter, there exists
a local solution $(y_0,u_0)$ of $(O_{h_0})$ and an associated adjoint state
$p_0 \in W^{1,2}(\Omega) \cap C(\bar\Omega)$, such that the following first-order
necessary optimality conditions hold:

$$D_y\mathcal{L}(y_0,u_0,p_0,h_0)\,z = 0 \quad \text{for all } z \in W^{1,2}(\Omega), \tag{8}$$

$$D_u\mathcal{L}(y_0,u_0,p_0,h_0)(u - u_0) \ge 0 \quad \text{for all } u \in U. \tag{9}$$

In a standard way, conditions (8) and (9) yield the adjoint equation and
the pointwise stationarity of the Hamiltonian:

$$-\Delta p_0(x) + p_0(x) = D_y\varphi(y_0(x),h_0) \ \text{in } \Omega, \qquad
\partial_\nu p_0(x) = D_y\mathcal{H}(y_0(x),u_0(x),p_0(x),h_0) \ \text{on } \Gamma, \tag{10}$$

$$D_u\mathcal{H}(y_0(x),u_0(x),p_0(x),h_0)(u - u_0(x)) \ge 0
\quad \text{for all } u \in [m_1,m_2] \text{ and a.a. } x \in \Gamma. \tag{11}$$
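
The pointwise condition (11) can be checked by hand in a toy case. The following sketch assumes a hypothetical Hamiltonian that is quadratic in the control at a fixed boundary point, $\mathcal{H}(u) = \tfrac12 a u^2 + c u$ with $a > 0$; this form, the function names and the sampling grid are illustrative assumptions and do not come from the paper. For such a Hamiltonian the clipped unconstrained minimizer satisfies the variational inequality, while a non-stationary value violates it.

```python
def hamiltonian_du(u, a, c):
    """Derivative in u of the toy Hamiltonian 0.5*a*u**2 + c*u (a > 0)."""
    return a * u + c


def stationary_control(a, c, m1, m2):
    """Clip the unconstrained minimizer -c/a to the interval [m1, m2]."""
    return min(max(-c / a, m1), m2)


def satisfies_vi(u0, a, c, m1, m2, samples=101, tol=1e-12):
    """Check D_u H(u0) * (u - u0) >= 0 on an equidistant grid of u in [m1, m2]."""
    gradient = hamiltonian_du(u0, a, c)
    grid = [m1 + (m2 - m1) * k / (samples - 1) for k in range(samples)]
    return all(gradient * (u - u0) >= -tol for u in grid)


if __name__ == "__main__":
    m1, m2 = -1.0, 1.0
    for a, c in ((2.0, -6.0), (2.0, 1.0), (2.0, 6.0)):
        u0 = stationary_control(a, c, m1, m2)
        print(f"a={a}, c={c}: u0={u0}, condition (11) holds: {satisfies_vi(u0, a, c, m1, m2)}")
```

The three parameter pairs cover an active upper bound, an inactive constraint and an active lower bound, which are the cases distinguished later through the sets $I$ and $J$.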



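Conditions (10) and (11) together with the state equation (3) constitute
the optimality system for $(O_{h_0})$. It will be convenient to rewrite this
optimality system in the form of a generalized equation. To do that,
define the spaces

$$X^s := W^{1,s}(\Omega) \times L^s(\Gamma) \times W^{1,s}(\Omega), \qquad
\Delta^s := L^s(\Omega) \times L^s(\Gamma) \times L^s(\Omega) \times L^s(\Gamma) \times L^s(\Gamma), \qquad s \in [2,\infty], \tag{12}$$

and the following set-valued mapping with closed graph:

$$\mathcal{N}(u) = \begin{cases}
\{\lambda \in L^\infty(\Gamma) \mid \int_\Gamma \lambda\,(v - u)\,dS_x \le 0 \ \ \forall v \in U\} & \text{if } u \in U,\\
\emptyset & \text{if } u \notin U.
\end{cases} \tag{13}$$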

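Denote $\xi = (y,u,p) \in X^\infty$. Let the function
$\mathcal{F} : X^\infty \times G \to \Delta^\infty$, as well as the multivalued mapping
$\mathcal{T} : X^\infty \to 2^{\Delta^\infty}$, be defined as follows:

$$\mathcal{F}(\xi,h) = \begin{bmatrix}
-\Delta y + y & \text{in } \Omega\\
\partial_\nu y - b(y,u,h) & \text{on } \Gamma\\
-\Delta p + p - D_y\varphi(y,h) & \text{in } \Omega\\
\partial_\nu p - D_y\mathcal{H}(y,u,p,h) & \text{on } \Gamma\\
D_u\mathcal{H}(y,u,p,h) & \text{on } \Gamma
\end{bmatrix}, \qquad
\mathcal{T}(\xi) = \begin{bmatrix} \{0\}\\ \{0\}\\ \{0\}\\ \{0\}\\ \mathcal{N}(u) \end{bmatrix}. \tag{14}$$

Then the optimality system (3), (10), (11) for $(O_{h_0})$ can be expressed in
the form of the following generalized equation:

$$0 \in \mathcal{F}(\xi_0,h_0) + \mathcal{T}(\xi_0). \tag{15}$$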



3. Application of abstract theorems for 
generalized equations 

We are going to investigate conditions under which there exists a
neighborhood $G_0$ of $h_0$ such that, for each $h \in G_0$, the generalized
equation

$$0 \in \mathcal{F}(\xi,h) + \mathcal{T}(\xi) \tag{16}$$

has a locally unique solution $\xi_h = (y_h,u_h,p_h)$, which is a Bouligand
differentiable function of $h$. We will follow the same scheme as in [4, 5].
Namely, the proof will be in two steps. First, we show existence, local
uniqueness and Lipschitz continuity of the solutions to (16). In the
second step, we use these properties to show differentiability of the
solutions. In both steps we need the following auxiliary generalized equation,
obtained from (16) by linearization of $\mathcal{F}$ at the reference solution
and by perturbation:

$$\delta \in \mathcal{F}(\xi_0,h_0) + D_\xi\mathcal{F}(\xi_0,h_0)(\zeta - \xi_0) + \mathcal{T}(\zeta), \tag{17}$$

where $\delta \in \Delta^\infty$ is the perturbation. Clearly, for $\delta = 0$,
$\zeta_0 = \xi_0$ is a solution to (17). We will denote by
$B^X_\rho(x_0) := \{x \in X \mid \|x - x_0\|_X \le \rho\}$ the closed
ball of radius $\rho$ centered at $x_0$ in a Banach space $X$.

The following Robinson's implicit function theorem (see Theorem 2.1
and Corollary 2.2 in [7]) allows us to deduce existence and local Lipschitz
continuity of the solutions to the nonlinear generalized equation (16)
from the same properties of the solutions to the linearized equation (17).

Theorem 1 If there exist $\rho_1 > 0$ and $\rho_2 > 0$ such that, for each
$\delta \in B^{\Delta^\infty}_{\rho_1}(0)$, there is a unique solution $\zeta_\delta$ in
$B^{X^\infty}_{\rho_2}(\xi_0)$ of (17), which is Lipschitz continuous in $\delta$,
then there exist $\sigma_1 > 0$ and $\sigma_2 > 0$ such that, for each
$h \in B^H_{\sigma_1}(h_0)$, there is a unique solution $\xi_h$ in
$B^{X^\infty}_{\sigma_2}(\xi_0)$ of (16), which is Lipschitz continuous in $h$.

Similarly, the following theorem due to Dontchev (see Theorem 2.4 and
Remark 2.6 in [2]) allows us to reduce differentiability analysis for the
solutions to (16) to the same analysis for the solutions to (17).

Theorem 2 If the assumptions of Theorem 1 are satisfied and, in addition,
the solutions $\zeta_\delta \in B^{X^\infty}_{\rho_2}(\xi_0)$ of (17) are Bouligand
differentiable functions of $\delta$ in a neighborhood of the origin, with the
differential $(D_\delta\zeta_0; \eta)$, then the solutions $\xi_h$ of (16) are Bouligand
differentiable in a neighborhood of $h_0$. For a direction $g \in H$, the
differential at $h_0$ is given by

$$(D_h\xi_0; g) = (D_\delta\zeta_0; -D_h\mathcal{F}(\xi_0,h_0)\,g). \tag{18}$$

Remark 1 In Theorem 1, Lipschitz continuity of $\zeta$ and $\xi$ is understood
in the sense of that norm in the space $X$, in which $\mathcal{F}(\cdot,h)$ is differentiable.
On the other hand, Theorem 2 remains true if B-differentiability of $\zeta_\delta$
is satisfied in a norm in the image space $X$ weaker than that in which
Lipschitz continuity in Theorem 1 holds (see Remark 2.11 in [2]); e.g.,
in $L^s$ ($s < \infty$), rather than in $L^\infty$. This property will be used in Section 4.



In order to apply Theorems 1 and 2 to $(O_h)$, we have to find the form
of the linearization (17) of the optimality system (16), for $\mathcal{F}$ and
$\mathcal{T}$ given in (14). To simplify notation, the functions evaluated at the
reference point will be denoted by subscript "0", e.g.,
$\varphi_0 := \varphi(y_0,h_0)$, $\mathcal{H}_0 := \mathcal{H}(y_0,u_0,p_0,h_0)$.
Moreover, we denote $\xi_0 := (y_0,u_0,p_0)$.

Let $\delta = (\delta^1,\delta^2,\delta^3,\delta^4,\delta^5) \in \Delta^\infty$ be a vector
of perturbations. By simple calculations we obtain the following form of (17):

$(LO_\delta)$

$$-\Delta z + z = e^1 + \delta^1, \qquad
\partial_\nu z - D_y b_0\,z = e^2 + \delta^2 + D_u b_0\,v, \tag{19}$$

$$-\Delta q + q = e^3 + \delta^3 + D^2_{yy}\varphi_0\,z, \qquad
\partial_\nu q - D_y b_0\,q = e^4 + \delta^4 + D^2_{yy}\mathcal{H}_0\,z + D^2_{yu}\mathcal{H}_0\,v, \tag{20}$$

$$D^2_{uy}\mathcal{H}_0\,z + D^2_{uu}\mathcal{H}_0\,v + D_u b_0\,q - e^5 - \delta^5 \in -\mathcal{N}(v), \tag{21}$$

where $e = (e^1,e^2,e^3,e^4,e^5) \in \Delta^\infty$ is a given vector.

Note that

$$(z_0,v_0,q_0) = (y_0,u_0,p_0) \tag{22}$$

is a solution to $(LO_\delta)$ for $\delta = 0$. An inspection shows that $(LO_\delta)$ can
be treated as an optimality system for the following linear-quadratic
accessory problem:



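$(LP_\delta)$ Find $(z_\delta,v_\delta) \in W^{1,2}(\Omega) \times L^2(\Gamma)$ such that

$$\mathcal{I}(z_\delta,v_\delta,\delta) = \min \mathcal{I}(z,v,\delta)$$

subject to

$$-\Delta z(x) + z(x) = e^1(x) + \delta^1(x) \ \text{in } \Omega, \qquad
\partial_\nu z(x) = D_y b_0(x)\,z(x) + D_u b_0(x)\,v(x) + e^2(x) + \delta^2(x) \ \text{on } \Gamma, \tag{23}$$

$$v \in U,$$

where

$$\mathcal{I}(z,v,\delta) := \tfrac12\big((z,v), D^2\mathcal{L}_0(z,v)\big) + \int_\Omega (e^3 + \delta^3)\,z\,dx
+ \int_\Gamma [(e^4 + \delta^4)\,z + (e^5 + \delta^5)\,v]\,dS_x,$$

with the quadratic form

$$\big((z,v), D^2\mathcal{L}_0(z,v)\big) := \int_\Omega D^2_{yy}\varphi(y_0,h_0)\,z^2\,dx
+ \int_\Gamma [z, v] \begin{bmatrix} D^2_{yy}\mathcal{H}_0 & D^2_{yu}\mathcal{H}_0\\ D^2_{uy}\mathcal{H}_0 & D^2_{uu}\mathcal{H}_0 \end{bmatrix}
\begin{bmatrix} z\\ v \end{bmatrix} dS_x. \tag{24}$$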



To verify the assumptions of Theorems 1 and 2, we have to show that
there exist constants $\rho_1,\rho_2 > 0$ such that for each
$\delta \in B^{\Delta^\infty}_{\rho_1}(0)$ there is a unique stationary point
$\zeta_\delta := (z_\delta,v_\delta,q_\delta)$ in $B^{X^\infty}_{\rho_2}(\xi_0)$ of $(LP_\delta)$,
which is a Lipschitz continuous and Bouligand differentiable function of $\delta$.
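
4. Differentiability of solutions to accessory
problems

As in [4], the starting point in the proof of differentiability of the
solutions to $(LP_\delta)$ is the Lipschitz continuity property for these solutions.
To this end, we will need a coercivity assumption (see [6]). Let us define
the sets of those points at which the reference control is active:

$$I = \{x \in \Gamma \mid u_0(x) = m_1\}, \qquad J = \{x \in \Gamma \mid u_0(x) = m_2\}. \tag{25}$$

Moreover, for any $\alpha \ge 0$ define the sets

$$I^\alpha = \{x \in \Gamma \mid D_u\mathcal{H}(y_0,u_0,p_0,h_0)(x) > \alpha\}, \qquad
J^\alpha = \{x \in \Gamma \mid -D_u\mathcal{H}(y_0,u_0,p_0,h_0)(x) > \alpha\}. \tag{26}$$

As in [6], we assume:

(AC) (coercivity) There exist $\alpha > 0$ and $\gamma > 0$ such that

$$\big((z,v), D^2\mathcal{L}_0(z,v)\big) \ge \gamma\,\|v\|^2_{L^2(\Gamma)} \tag{27}$$

for all pairs $(z,v)$ satisfying

$$-\Delta z(x) + z(x) = 0 \ \text{in } \Omega, \qquad
\partial_\nu z(x) - D_y b_0(x)\,z(x) - D_u b_0(x)\,v(x) = 0 \ \text{on } \Gamma,$$

and such that $v \in \{v \in L^2(\Gamma) \mid v(x) = 0 \text{ for a.a. } x \in I^\alpha \cup J^\alpha\}$.

Note that (AC) implies the following pointwise coercivity condition
(see, e.g., Lemma 5.1 in [10]):

$$D^2_{uu}\mathcal{H}_0(x) \ge \gamma \quad \text{for a.a. } x \in \Gamma \setminus (I^\alpha \cup J^\alpha). \tag{28}$$

By a slight modification of Satz 18 in [11] we get the following Lipschitz
continuity result for $(LP_\delta)$:

Proposition 1 If (AC) holds, then there exist constants $\rho_1 > 0$ and
$\rho_2 > 0$ such that, for all $\delta \in B^{\Delta^\infty}_{\rho_1}(0)$, there is a unique
stationary point $\zeta_\delta := (z_\delta,v_\delta,q_\delta)$ in $B^{X^\infty}_{\rho_2}(\xi_0)$
of $(LP_\delta)$. Moreover, there exists a constant $\ell > 0$ such that

$$\|z_{\delta'} - z_{\delta''}\|_{W^{1,s}(\Omega)},\ \|v_{\delta'} - v_{\delta''}\|_{L^s(\Gamma)},\ \|q_{\delta'} - q_{\delta''}\|_{W^{1,s}(\Omega)}
\le \ell\,\|\delta' - \delta''\|_{\Delta^s} \tag{29}$$

for all $\delta', \delta'' \in B^{\Delta^\infty}_{\rho_1}(0)$ and all $s \in [2,\infty]$.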

The proof of B-differentiability of the stationary points of $(LP_\delta)$ is in
two steps. In the first step, directional differentiability is proved and the
directional differential is characterized. This characterization is used in
the second step to show that the differential is actually Bouligand. Let
us start with the directional differentiability. The proof of the following
result is very similar to that of Proposition 4.3 in [5].

Proposition 2 Let (A1)-(A3) as well as (AC) be satisfied and let
$\rho_1, \rho_2 > 0$ be as in Proposition 1. Then the mapping

$$\delta \mapsto \zeta_\delta := (z_\delta,v_\delta,q_\delta),$$

where $\zeta_\delta \in B^{X^\infty}_{\rho_2}(\xi_0)$ denotes the unique stationary point of
$(LP_\delta)$, is directionally differentiable. The directional differential at
$\delta = 0$, in a direction $\eta \in \Delta^\infty$, is given by
$(\varpi_\eta, w_\eta, r_\eta)$, where $(\varpi_\eta, w_\eta)$ is the solution and
$r_\eta$ the associated adjoint state of the following linear-quadratic optimal
control problem:

$(LQ_\eta)$ Find $(\varpi_\eta, w_\eta) \in W^{1,2}(\Omega) \times L^2(\Gamma)$ that minimizes

$$\mathcal{J}_\eta(\varpi,w) = \tfrac12\big((\varpi,w), D^2\mathcal{L}_0(\varpi,w)\big)
+ \int_\Omega \eta^3\,\varpi\,dx + \int_\Gamma (\eta^4\,\varpi + \eta^5\,w)\,dS_x \tag{30}$$

subject to

$$-\Delta\varpi + \varpi = \eta^1 \ \text{in } \Omega, \qquad
\partial_\nu\varpi = D_y b_0\,\varpi + D_u b_0\,w + \eta^2 \ \text{on } \Gamma, \tag{31}$$

$$w(x) \begin{cases}
= 0 & \text{for } x \in (I^0 \cup J^0),\\
\ge 0 & \text{for } x \in (I \setminus I^0),\\
\le 0 & \text{for } x \in (J \setminus J^0),\\
\text{free} & \text{for } x \in \Gamma \setminus (I \cup J).
\end{cases} \tag{32}$$
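
The case distinction in (32) depends only on two pointwise quantities of the reference solution: the value $u_0(x)$ and the sign of $D_u\mathcal{H}_0(x)$. The sketch below spells this out for a single boundary point. The function name, the string labels and the floating-point tolerance are assumptions introduced for illustration; the sets $I^0$, $J^0$ are taken as $I^\alpha$, $J^\alpha$ with $\alpha = 0$.

```python
def direction_constraint(u0, du_h0, m1, m2, tol=1e-12):
    """Return the restriction that (32) imposes on w(x) at one boundary point.

    u0    -- reference control value u_0(x)
    du_h0 -- value of D_u H_0(x) at the reference solution
    """
    at_lower = abs(u0 - m1) <= tol      # x belongs to I
    at_upper = abs(u0 - m2) <= tol      # x belongs to J
    if at_lower and du_h0 > tol:        # strongly active lower bound: x in I^0
        return "w = 0"
    if at_upper and -du_h0 > tol:       # strongly active upper bound: x in J^0
        return "w = 0"
    if at_lower:                        # weakly active: x in I \ I^0
        return "w >= 0"
    if at_upper:                        # weakly active: x in J \ J^0
        return "w <= 0"
    return "free"                       # inactive: x in Gamma \ (I u J)


if __name__ == "__main__":
    m1, m2 = 0.0, 1.0
    for u0, g in ((0.0, 0.5), (0.0, 0.0), (1.0, -0.2), (1.0, 0.0), (0.5, 0.0)):
        print(f"u0 = {u0}, D_u H_0 = {g}: {direction_constraint(u0, g, m1, m2)}")
```

Only the weakly active points, where the bound is attained but $D_u\mathcal{H}_0$ vanishes, produce one-sided constraints; they are the source of the non-linearity of the differential.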



Note that, by the same argument as in Proposition 1, we find that
the stationary points of $(LQ_\eta)$ are Lipschitz continuous functions of $\eta$.
Since $(\varpi_0, w_0, r_0) = (0,0,0)$, we have

$$\|\varpi_\eta\|_{W^{1,s}(\Omega)},\ \|w_\eta\|_{L^s(\Gamma)},\ \|r_\eta\|_{W^{1,s}(\Omega)} \le \ell\,\|\eta\|_{\Delta^s}, \qquad s \in [2,\infty]. \tag{33}$$

We are now going to show that $(\varpi_\eta, w_\eta)$ and $r_\eta$ are actually
B-differentials at 0 of $(z_\delta,v_\delta)$ and $q_\delta$, respectively.

Theorem 3 Let (A1)-(A3) as well as (AC) be satisfied and let $\rho_1, \rho_2 > 0$
be as in Proposition 1. Then the mapping

$$\zeta_\delta = (z_\delta,v_\delta,q_\delta) : B^{\Delta^\infty}_{\rho_1}(0) \to X^s, \tag{34}$$

where $\zeta_\delta \in B^{X^\infty}_{\rho_2}(\xi_0)$ denotes the unique stationary point of
$(LP_\delta)$, is B-differentiable for any $s \in [2,\infty)$. The B-differential at
$\delta = 0$ in a direction $\eta \in \Delta^\infty$ is given by
$(\varpi_\eta, w_\eta, r_\eta)$, where $(\varpi_\eta, w_\eta)$ is the solution
and $r_\eta$ the associated adjoint state of problem $(LQ_\eta)$.

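Proof The optimality system for $(LQ_\eta)$ takes the form:

$$-\Delta\varpi + \varpi = \eta^1, \qquad
\partial_\nu\varpi - D_y b_0\,\varpi = \eta^2 + D_u b_0\,w, \tag{35}$$

$$-\Delta r + r = \eta^3 + D^2_{yy}\varphi_0\,\varpi, \qquad
\partial_\nu r - D_y b_0\,r = \eta^4 + D^2_{yy}\mathcal{H}_0\,\varpi + D^2_{yu}\mathcal{H}_0\,w, \tag{36}$$

$$\big(D^2_{uy}\mathcal{H}_0\,\varpi + D^2_{uu}\mathcal{H}_0\,w + D_u b_0\,r - \eta^5,\ v - w\big) \ge 0
\quad \text{for all } v \in L^2(\Gamma) \text{ satisfying (32)}. \tag{37}$$

We have to show that the solution $(\varpi_\eta, w_\eta, r_\eta)$ of (35)-(37) is the
B-differential of the solution to $(LO_\delta)$. Clearly, $(\varpi_\eta, w_\eta, r_\eta)$ is a
positively homogeneous function of $\eta$, so, by Definition 1, it is enough to show that

$$z_\eta = z_0 + \varpi_\eta + a_1(\eta), \qquad v_\eta = v_0 + w_\eta + a_2(\eta), \qquad q_\eta = q_0 + r_\eta + a_3(\eta),$$

$$\text{where} \quad \frac{\|a_1(\eta)\|_{W^{1,s}(\Omega)}}{\|\eta\|_{\Delta^\infty}},\
\frac{\|a_2(\eta)\|_{L^s(\Gamma)}}{\|\eta\|_{\Delta^\infty}},\
\frac{\|a_3(\eta)\|_{W^{1,s}(\Omega)}}{\|\eta\|_{\Delta^\infty}} \to 0
\quad \text{as } \|\eta\|_{\Delta^\infty} \to 0, \tag{38}$$

for any $s \in [2,\infty)$.

Denote

$$(z_\eta - z_0) =: \tilde\varpi_\eta, \qquad (v_\eta - v_0) =: \tilde w_\eta, \qquad (q_\eta - q_0) =: \tilde r_\eta. \tag{39}$$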

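It follows from (19) and (20) that $(\tilde\varpi_\eta, \tilde w_\eta, \tilde r_\eta)$ satisfies
equations identical with (35) and (36):

$$-\Delta\tilde\varpi + \tilde\varpi = \eta^1, \qquad
\partial_\nu\tilde\varpi - D_y b_0\,\tilde\varpi = \eta^2 + D_u b_0\,\tilde w, \tag{40}$$

$$-\Delta\tilde r + \tilde r = \eta^3 + D^2_{yy}\varphi_0\,\tilde\varpi, \qquad
\partial_\nu\tilde r - D_y b_0\,\tilde r = \eta^4 + D^2_{yy}\mathcal{H}_0\,\tilde\varpi + D^2_{yu}\mathcal{H}_0\,\tilde w. \tag{41}$$

To characterize $(\tilde\varpi_\eta, \tilde w_\eta, \tilde r_\eta)$, we still need a condition
analogous to (37). To this end, let us choose $\beta \in (0,\alpha)$, where $\alpha$ is
given in (AC). Define the sets

$$\begin{aligned}
K_1^\beta &= \{x \in I^0 \mid D_u\mathcal{H}_0(x) \in (0,\beta)\},\\
K_2^\beta &= \{x \in J^0 \mid -D_u\mathcal{H}_0(x) \in (0,\beta)\},\\
L^\beta &= \{x \in \Gamma \mid u_0(x) \in (m_1, m_1 + \beta) \cup (m_2 - \beta, m_2)\}.
\end{aligned} \tag{42}$$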
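
Note that

$$\operatorname{meas}(K_1^\beta \cup K_2^\beta \cup L^\beta) \to 0 \quad \text{as } \beta \to 0. \tag{43}$$

Let us split up the set $\Gamma$ into the following subsets:

$$\mathcal{A} = \Gamma \setminus (I \cup J \cup L^\beta), \qquad \mathcal{B} = (I^0 \setminus K_1^\beta) \cup (J^0 \setminus K_2^\beta),$$

$$\mathcal{C} = (I \setminus I^0) \cup (J \setminus J^0), \qquad \mathcal{D} = K_1^\beta \cup K_2^\beta \cup L^\beta.$$

We will analyze conditions analogous to (37) on each of these subsets
successively.

Subset $\mathcal{A}$. Choose $\rho(\beta) > 0$ sufficiently small. Then by (22) and (25),
as well as by Proposition 1, for all $\eta \in B^{\Delta^\infty}_{\rho(\beta)}(0)$ we get

$$v_\eta(x) \in (m_1, m_2) \quad \text{for a.a. } x \in \mathcal{A}, \tag{44}$$

i.e., by (21),

$$D^2_{uy}\mathcal{H}_0(x)\,z_\eta(x) + D^2_{uu}\mathcal{H}_0(x)\,v_\eta(x) + D_u b_0(x)\,q_\eta(x) - e^5(x) - \eta^5(x) = 0
\quad \text{for a.a. } x \in \mathcal{A}. \tag{45}$$

Subtracting from (45) the analogous equation for $(z_0,v_0,q_0)$ and using
notation (39), we obtain

$$D^2_{uy}\mathcal{H}_0(x)\,\tilde\varpi_\eta(x) + D^2_{uu}\mathcal{H}_0(x)\,\tilde w_\eta(x) + D_u b_0(x)\,\tilde r_\eta(x) - \eta^5(x) = 0
\quad \text{for a.a. } x \in \mathcal{A}. \tag{46}$$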

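Subset $\mathcal{B}$. It follows from Proposition 1 that, shrinking $\rho(\beta) > 0$ if
necessary, for all $\eta \in B^{\Delta^\infty}_{\rho(\beta)}(0)$ we obtain

$$D^2_{uy}\mathcal{H}_0(x)\,z_\eta(x) + D^2_{uu}\mathcal{H}_0(x)\,v_\eta(x) + D_u b_0(x)\,q_\eta(x) - e^5(x) - \eta^5(x)
\begin{cases} > 0 & \text{for a.a. } x \in I^0 \setminus K_1^\beta,\\ < 0 & \text{for a.a. } x \in J^0 \setminus K_2^\beta, \end{cases}$$

which, by (21), implies

$$v_\eta(x) = \begin{cases} m_1 & \text{for a.a. } x \in I^0 \setminus K_1^\beta,\\ m_2 & \text{for a.a. } x \in J^0 \setminus K_2^\beta, \end{cases} \tag{47}$$

i.e.,

$$\tilde w_\eta(x) = 0 \quad \text{for a.a. } x \in \mathcal{B}. \tag{48}$$

Subset $\mathcal{C}$. By (22) and (25) we have

$$v_0(x) = u_0(x) = \begin{cases} m_1 & \text{for a.a. } x \in I \setminus I^0,\\ m_2 & \text{for a.a. } x \in J \setminus J^0, \end{cases} \tag{49}$$

$$D^2_{uy}\mathcal{H}_0(x)\,z_0(x) + D^2_{uu}\mathcal{H}_0(x)\,v_0(x) + D_u b_0(x)\,q_0(x) = 0
\quad \text{for a.a. } x \in (I \setminus I^0) \cup (J \setminus J^0). \tag{50}$$

Proposition 1, together with (49), implies that, shrinking $\rho(\beta)$ if necessary,
for any $\eta \in B^{\Delta^\infty}_{\rho(\beta)}(0)$ we get

$$v_\eta(x) \in \begin{cases} [m_1, m_2) & \text{for a.a. } x \in I \setminus I^0,\\ (m_1, m_2] & \text{for a.a. } x \in J \setminus J^0. \end{cases} \tag{51}$$

Hence, in view of (21) we have

$$D^2_{uy}\mathcal{H}_0(x)\,z_\eta(x) + D^2_{uu}\mathcal{H}_0(x)\,v_\eta(x) + D_u b_0(x)\,q_\eta(x) - \eta^5(x)
\begin{cases} \ge 0 & \text{for a.a. } x \in I \setminus I^0,\\ \le 0 & \text{for a.a. } x \in J \setminus J^0. \end{cases} \tag{52}$$

Conditions (49)-(52) imply:

$$\tilde w_\eta(x) \begin{cases} \ge 0 & \text{for a.a. } x \in I \setminus I^0,\\ \le 0 & \text{for a.a. } x \in J \setminus J^0, \end{cases} \tag{53}$$

$$D^2_{uy}\mathcal{H}_0(x)\,\tilde\varpi_\eta(x) + D^2_{uu}\mathcal{H}_0(x)\,\tilde w_\eta(x) + D_u b_0(x)\,\tilde r_\eta(x) - \eta^5(x)
\begin{cases} \ge 0 & \text{for a.a. } x \in I \setminus I^0,\\ \le 0 & \text{for a.a. } x \in J \setminus J^0, \end{cases} \tag{54}$$

and

$$\big(D^2_{uy}\mathcal{H}_0(x)\,\tilde\varpi_\eta(x) + D^2_{uu}\mathcal{H}_0(x)\,\tilde w_\eta(x) + D_u b_0(x)\,\tilde r_\eta(x) - \eta^5(x)\big)\big(w - \tilde w_\eta(x)\big) \ge 0
\quad \begin{cases} \text{for all } w \ge 0 & \text{on } I \setminus I^0,\\ \text{for all } w \le 0 & \text{on } J \setminus J^0. \end{cases} \tag{55}$$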



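Subset $\mathcal{D}$. The analysis of subset $\mathcal{D}$ is the most difficult, because we do
not know a priori if for $x \in \mathcal{D}$ the constraints are active or not at $v_\eta$,
no matter how small $\eta$ is chosen. Without this information, we can say
very little about $\tilde w_\eta(x) = v_\eta(x) - v_0(x)$. Let us denote

$$(\eta^5)'(x) = D^2_{uy}\mathcal{H}_0(x)\,(z_\eta(x) - z_0(x)) + D^2_{uu}\mathcal{H}_0(x)\,(v_\eta(x) - v_0(x))
+ D_u b_0(x)\,(q_\eta(x) - q_0(x)) \quad \text{for a.a. } x \in \mathcal{D}. \tag{56}$$

By definition (39) we have

$$D^2_{uy}\mathcal{H}_0(x)\,\tilde\varpi_\eta(x) + D^2_{uu}\mathcal{H}_0(x)\,\tilde w_\eta(x) + D_u b_0(x)\,\tilde r_\eta(x) - (\eta^5)'(x) = 0
\quad \text{for a.a. } x \in \mathcal{D}. \tag{57}$$

Denote $\eta' = (\eta^1, \eta^2, \eta^3, \eta^4, (\eta^5)')$, where

$$(\eta^5)'(x) = \begin{cases} \text{the right-hand side of (56)} & \text{for } x \in \mathcal{D},\\ \eta^5(x) & \text{otherwise.} \end{cases} \tag{58}$$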



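It is easy to see that (40) and (41) together with (46), (48), (53)-(55)
and (57) can be interpreted as an optimality system for the optimal
control problem $(LQ'_{\eta'})$, where $(LQ'_\eta)$ is the following slight modification
of $(LQ_\eta)$:

$(LQ'_\eta)$ Find $(\varpi_\eta, w_\eta) \in W^{1,2}(\Omega) \times L^2(\Gamma)$ that minimizes
$\mathcal{J}_\eta(\varpi,w)$ subject to

$$-\Delta\varpi(x) + \varpi(x) = \eta^1(x) \ \text{in } \Omega, \qquad
\partial_\nu\varpi(x) = D_y b_0(x)\,\varpi(x) + D_u b_0(x)\,w(x) + \eta^2(x) \ \text{on } \Gamma,$$

$$w(x) \begin{cases}
= 0 & \text{for } x \in (I^0 \setminus K_1^\beta) \cup (J^0 \setminus K_2^\beta),\\
\ge 0 & \text{for } x \in (I \setminus I^0),\\
\le 0 & \text{for } x \in (J \setminus J^0),\\
\text{free} & \text{for } x \in (\Gamma \setminus (I \cup J)) \cup (K_1^\beta \cup K_2^\beta).
\end{cases}$$

Similarly, $(\varpi_\eta, w_\eta, r_\eta)$ can be interpreted as a stationary point of
$(LQ'_{\eta''})$, where $\eta'' = (\eta^1, \eta^2, \eta^3, \eta^4, (\eta^5)'')$ with

$$(\eta^5)''(x) = \begin{cases}
D^2_{uy}\mathcal{H}_0(x)\,\varpi_\eta(x) + D^2_{uu}\mathcal{H}_0(x)\,w_\eta(x) + D_u b_0(x)\,r_\eta(x) & \text{for } x \in \mathcal{D},\\
\eta^5(x) & \text{otherwise.}
\end{cases} \tag{59}$$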

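It can be easily checked that, as in the case of $(LQ_\eta)$, the stationary
points of $(LQ'_\eta)$ are Lipschitz continuous functions of $\eta$. Hence, in view
of (58) and (59), we have

$$\|\varpi_\eta - \tilde\varpi_\eta\|_{W^{1,s}(\Omega)},\ \|w_\eta - \tilde w_\eta\|_{L^s(\Gamma)},\ \|r_\eta - \tilde r_\eta\|_{W^{1,s}(\Omega)}
\le \ell\,\|\eta' - \eta''\|_{\Delta^s}
= \ell \Big\{ \int_{K_1^\beta \cup K_2^\beta \cup L^\beta} |(\eta^5)'(x) - (\eta^5)''(x)|^s\,dS_x \Big\}^{1/s}. \tag{60}$$

Using the definitions (56), (59) and taking advantage of (29) and of (33)
we get

$$\begin{aligned}
|(\eta^5)'(x) - (\eta^5)''(x)| &\le |(\eta^5)'(x)| + |(\eta^5)''(x)|\\
&= |D^2_{uy}\mathcal{H}_0(x)\,(z_\eta(x) - z_0(x)) + D^2_{uu}\mathcal{H}_0(x)\,(v_\eta(x) - v_0(x)) + D_u b_0(x)\,(q_\eta(x) - q_0(x))|\\
&\quad + |D^2_{uy}\mathcal{H}_0(x)\,\varpi_\eta(x) + D^2_{uu}\mathcal{H}_0(x)\,w_\eta(x) + D_u b_0(x)\,r_\eta(x)|\\
&\le c\,\|\eta\|_{\Delta^\infty} \quad \text{for a.a. } x \in K_1^\beta \cup K_2^\beta \cup L^\beta.
\end{aligned} \tag{61}$$

Substituting (61) into (60) we obtain

$$\|\varpi_\eta - \tilde\varpi_\eta\|_{W^{1,s}(\Omega)},\ \|w_\eta - \tilde w_\eta\|_{L^s(\Gamma)},\ \|r_\eta - \tilde r_\eta\|_{W^{1,s}(\Omega)}
\le c\,\|\eta\|_{\Delta^\infty} \big\{ \operatorname{meas}(K_1^\beta \cup K_2^\beta \cup L^\beta) \big\}^{1/s}. \tag{62}$$

In view of (39) and (43), we find that for any $\epsilon > 0$ and any $s \in [2,\infty)$
we can choose $\beta(\epsilon,s) > 0$ and the corresponding $\rho(\beta(\epsilon,s))$ so small that

$$\|z_\eta - z_0 - \varpi_\eta\|_{W^{1,s}(\Omega)},\ \|v_\eta - v_0 - w_\eta\|_{L^s(\Gamma)},\ \|q_\eta - q_0 - r_\eta\|_{W^{1,s}(\Omega)}
\le \epsilon\,\|\eta\|_{\Delta^\infty} \quad \text{for all } \eta \in B^{\Delta^\infty}_{\rho(\beta(\epsilon,s))}(0).$$

This shows that (38) holds and completes the proof of the theorem.

Remark 2 The proof of Theorem 3 cannot be repeated for $s = \infty$, and
the counterexample in [4] shows that B-differentiability of (34) cannot
be expected for $s = \infty$.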

5. Differentiability of the solutions to nonlinear 
problems 

By Theorems 2 and 3, for any $h$ in a neighborhood of $h_0$, $(O_h)$ has a
unique stationary point $(y_h,u_h,p_h)$, which is a B-differentiable function
of $h$. On the other hand, by Theorem 3.7 in [6], for $h$ sufficiently close
to $h_0$, condition (AC) implies that $(y_h,u_h)$ is a solution to $(O_h)$. Thus,
we obtain the following principal result of this paper:

Theorem 4 If (A1)-(A4) and (AC) hold, then there exist constants
$\sigma_1, \sigma_2 > 0$ such that, for any $h \in B^H_{\sigma_1}(h_0)$, there is a unique
stationary point $(y_h,u_h,p_h)$ in $B^{X^\infty}_{\sigma_2}(\xi_0)$ of $(O_h)$, where
$(y_h,u_h)$ is a solution of $(O_h)$. The mapping

$$(y_h,u_h,p_h) : B^H_{\sigma_1}(h_0) \to X^s, \qquad s \in [2,\infty), \tag{63}$$

is B-differentiable, and the B-differential evaluated at $h_0$ in a direction
$g \in H$ is given by the solution and adjoint state of the following
linear-quadratic optimal control problem:

$(L_g)$ Find $(z_g,v_g) \in W^{1,2}(\Omega) \times L^2(\Gamma)$ that minimizes

$$\mathcal{K}_g(z,v) = \tfrac12\big((z,v), D^2\mathcal{L}_0(z,v)\big) + \int_\Omega D^2_{yh}\varphi_0\,g\,z\,dx
+ \int_\Gamma D^2_{yh}\mathcal{H}_0\,g\,z\,dS_x + \int_\Gamma D^2_{uh}\mathcal{H}_0\,g\,v\,dS_x$$

subject to

$$-\Delta z + z = 0 \ \text{in } \Omega, \qquad
\partial_\nu z = D_y b_0\,z + D_u b_0\,v + D_h b_0\,g \ \text{on } \Gamma,$$

and

$$v(x) \begin{cases}
= 0 & \text{for } x \in (I^0 \cup J^0),\\
\ge 0 & \text{for } x \in (I \setminus I^0),\\
\le 0 & \text{for } x \in (J \setminus J^0),\\
\text{free} & \text{for } x \in \Gamma \setminus (I \cup J).
\end{cases}$$

As noted in the Introduction, the Bouligand differential becomes the Fréchet
differential if it is linear. Hence, from the form of $(L_g)$, we obtain immediately:

Corollary 1 If $\operatorname{meas}(I \setminus I^0) = \operatorname{meas}(J \setminus J^0) = 0$, then the mapping
(63) is Fréchet differentiable.

In sensitivity analysis of optimization problems an important role is
played by the so-called optimal value function, which on $B^H_{\sigma_1}(h_0)$ is
defined by

$$F^0(h) := F_h(y_h,u_h),$$

i.e., to each $h \in B^H_{\sigma_1}(h_0)$ it assigns the (local) optimal value of the
cost functional. In exactly the same way as in Corollary 5.3 in [5], we
obtain the following result showing that Bouligand differentiability of
the solutions implies the second order expansion of $F^0$, uniform in a
neighborhood of $h_0$.

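Corollary 2 If the assumptions of Theorem 4 hold, then for each
$h = h_0 + g \in B^H_{\sigma_1}(h_0)$

$$F^0(h) = F^0(h_0) + (D_h\mathcal{L}_0, g)
+ \frac12 \left( \begin{bmatrix} z_g\\ v_g\\ g \end{bmatrix},
\begin{bmatrix}
D^2_{yy}\mathcal{L}_0 & D^2_{yu}\mathcal{L}_0 & D^2_{yh}\mathcal{L}_0\\
D^2_{uy}\mathcal{L}_0 & D^2_{uu}\mathcal{L}_0 & D^2_{uh}\mathcal{L}_0\\
D^2_{hy}\mathcal{L}_0 & D^2_{hu}\mathcal{L}_0 & D^2_{hh}\mathcal{L}_0
\end{bmatrix}
\begin{bmatrix} z_g\\ v_g\\ g \end{bmatrix} \right) + o(\|g\|^2_H), \tag{64}$$

where $(z_g,v_g)$ is the B-differential of $(y_h,u_h)$ at $h_0$ in the direction $g$,
i.e., it is given by the solution to $(L_g)$.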

Abstract The paper deals with shape optimization of a dynamic contact problem
with Coulomb friction for viscoelastic bodies. The mass nonpenetrability
condition is formulated in velocities. The friction coefficient is
assumed to be bounded. Using the material derivative method as well as
the results concerning the regularity of solutions to the dynamic variational
inequality, the directional derivative of the cost functional is calculated
and a necessary optimality condition is formulated.

Keywords: Dynamic unilateral problem, shape optimization, sensitivity analysis, 
necessary optimality condition 

1. Introduction 



This paper deals with formulation of a necessary optimality condition 
for a shape optimization problem of a viscoelastic body in unilateral dy- 
namic contact with a rigid foundation. It is assumed that the contact 
with given friction, described by Coulomb law [2], occurs at a portion 
of the boundary of the body. The contact condition is described in 
velocities. This first order approximation seems to be physically real- 
istic for the case of small distance between the body and the obstacle 
and for small time intervals. The friction coefficient is assumed to be 
bounded. The equilibrium state of this contact problem is described by
a hyperbolic variational inequality of the second order [2, 3, 5, 7, 17].

The shape optimization problem for the elastic body in contact consists
in finding, in the contact region, a shape of the boundary of the
domain occupied by the body such that the normal contact stress is
minimized. It is assumed that the volume of the body is constant.

Shape optimization of static contact problems was considered, among
others, in [3, 8, 9, 10, 11, 16]. In [3, 8] the existence of optimal solutions
and the convergence of finite-dimensional approximations were shown. In
[9, 10, 11, 16] necessary optimality conditions were formulated using the
material derivative approach (see [16]). Numerical results are reported
in [3, 11].

In this paper we shall study this shape optimization problem for a
viscoelastic body in unilateral dynamical contact. The essential difficulty in
dealing with the shape optimization problem for a dynamic contact problem
is the regularity of solutions to the state system. Assuming a small friction
coefficient and suitable regularity of the data, it can be shown [6, 7] that the
solution to the dynamic contact problem is regular enough to differentiate it
with respect to a parameter. Using the material derivative method [16] as
well as the regularity results for solutions to the dynamic variational
inequality [6, 7], we calculate the directional derivative of the cost
functional and formulate a necessary optimality condition for this problem.
The present paper extends the authors’ results contained in [12].

We shall use the following notation: $\Omega \subset R^2$ will denote the bounded
domain with Lipschitz continuous boundary $\Gamma$. The time variable will
be denoted by $t$ and the time interval by $I = (0, T)$, $T > 0$. By $H^k(\Omega)$,
$k \in (0, \infty)$, we will denote the Sobolev space of functions having derivatives in
all directions of the order $k$ belonging to $L^2(\Omega)$ [1]. For an interval $I$ and
a Banach space $B$, $L^p(I; B)$, $p \in (1, \infty)$, denotes the usual Bochner space
[2]. $u_t = du/dt$ and $u_{tt} = d^2u/dt^2$ will denote first and second order
derivatives, respectively, with respect to $t$ of function $u$. $u_{tN}$ and $u_{tT}$
will denote normal and tangential components, respectively, of function
$u_t$. $Q = I \times \Omega$, $\gamma_i = I \times \Gamma_i$, $i = 0, 1, 2$, where $\Gamma_i$ are pieces of the boundary
$\Gamma$.

2. Contact problem formulation 

Consider deformations of an elastic body occupying domain $\Omega \subset R^2$.
The boundary $\Gamma$ of domain $\Omega$ is Lipschitz continuous. The body is
subjected to body forces $f = (f_1, f_2)$. Moreover surface tractions
$p = (p_1, p_2)$ are applied to a portion $\Gamma_1$ of the boundary $\Gamma$. We assume that
the body is clamped along the portion $\Gamma_0$ of the boundary $\Gamma$ and that
the contact conditions are prescribed on the portion $\Gamma_2$ of the boundary
$\Gamma$. Moreover $\Gamma_i \cap \Gamma_j = \emptyset$, $i \ne j$, $i, j = 0, 1, 2$, $\Gamma = \bar\Gamma_0 \cup \bar\Gamma_1 \cup \bar\Gamma_2$.

We denote by $u = (u_1, u_2)$, $u = u(t, x)$, $x \in \Omega$, $t \in [0, T]$, $T > 0$, the
displacement of the body and by $\sigma = \{\sigma_{ij}(u(t, x))\}$, $i, j = 1, 2$, the stress
field in the body. We shall consider elastic bodies obeying Hooke’s law
[2, 3, 5, 17]:

$$\sigma_{ij}(u) = c^0_{ijkl}(x) e_{kl}(u) + c^1_{ijkl}(x) e_{kl}(u_t), \quad x \in \Omega, \quad e_{kl}(u) = \frac{1}{2}(u_{k,l} + u_{l,k}), \tag{1}$$

$i, j, k, l = 1, 2$, $u_{k,l} = \partial u_k / \partial x_l$. We use here the summation convention
over repeated indices [2]. $c^0_{ijkl}(x)$ and $c^1_{ijkl}(x)$, $i, j, k, l = 1, 2$, are
components of Hooke’s tensor. It is assumed that the elasticity coefficients
$c^0_{ijkl}$ and $c^1_{ijkl}$ satisfy the usual symmetry, boundedness and ellipticity conditions
[2, 3, 5]. In an equilibrium state a stress field $\sigma$ satisfies the system
[2, 3, 6, 7]:

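$$u_{tti} - \sigma_{ij}(x)_{,j} = f_i(x), \quad (t, x) \in (0, T) \times \Omega, \quad i, j = 1, 2, \tag{2}$$

where $\sigma_{ij}(x)_{,j} = \partial \sigma_{ij}(x) / \partial x_j$, $i, j = 1, 2$. There are given the following
boundary conditions:

$$u_i(x) = 0 \ \text{on } (0, T) \times \Gamma_0, \quad i = 1, 2,$$

$$\sigma_{ij}(x) n_j = p_i \ \text{on } (0, T) \times \Gamma_1, \quad i, j = 1, 2, \tag{3}$$

$$u_{tN} \le 0, \quad \sigma_N \le 0, \quad u_{tN} \sigma_N = 0 \ \text{on } (0, T) \times \Gamma_2, \tag{4}$$

$$u_{tT} = 0 \Rightarrow |\sigma_T| \le \mathcal{F} |\sigma_N|, \qquad u_{tT} \ne 0 \Rightarrow \sigma_T = -\mathcal{F} |\sigma_N| \frac{u_{tT}}{|u_{tT}|}. \tag{5}$$

Here we denote: $u_N = u_i n_i$, $\sigma_N = \sigma_{ij} n_i n_j$, $(u_T)_i = u_i - u_N n_i$,
$(\sigma_T)_i = \sigma_{ij} n_j - \sigma_N n_i$, $i, j = 1, 2$, $n = (n_1, n_2)$ is the unit outward vector to the
boundary $\Gamma$. There are given the following initial conditions:

$$u_i(0, x) = u_0, \quad u_{ti}(0, x) = u_1, \quad i = 1, 2, \quad x \in \Omega. \tag{6}$$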

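We shall consider problem (2)-(6) in the variational form. Let us
assume that

$$f \in H^{1/4}(I; (H^1(\Omega; R^2))^*) \cap L^2(Q; R^2),$$

$$p \in L^2(I; (H^{1/2}(\Gamma_1; R^2))^*),$$

$$u_0 \in H^{3/2}(\Omega; R^2), \quad u_1 \in H^{3/2}(\Omega; R^2), \quad u_{1|\Gamma_2} = 0, \tag{7}$$

$$\mathcal{F} \in L^\infty(\Gamma_2; R^2) \ \text{is continuous for a.e. } x \in \Gamma_2$$

are given. The space $L^2(Q; R^2)$ and the Sobolev spaces
$H^{1/4}(I; (H^1(\Omega; R^2))^*)$ are defined in [1, 2].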

Let us introduce:

$$F = \{z \in L^2(I; H^1(\Omega; R^2)) : z_i = 0 \ \text{on } (0, T) \times \Gamma_0, \ i = 1, 2\}, \tag{8}$$

$$K = \{z \in F : z_{tN} \le 0 \ \text{on } (0, T) \times \Gamma_2\}. \tag{9}$$



The problem (1) - (6) is equivalent to the following variational problem 
[6, 7]: find u € T°°(7; B?))f\K such that ut G 

L°°(7; L2(0; i?2)) nifV2(/. i 2 (^. and uu 6 B?)) 

n(i7^/^(J; satisfying the following inequality [6, 7], 



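$$\int_Q u_{tti} v_i\,dx\,d\tau + \int_Q \sigma_{ij}(u) e_{ij}(v_i - u_{ti})\,dx\,d\tau + \int_{\gamma_2} \mathcal{F} |\sigma_N(u)| (|v_T| - |u_{tT}|)\,ds\,d\tau$$

$$\ge \int_Q f_i (v_i - u_{ti})\,dx\,d\tau + \int_{\gamma_1} p_i (v_i - u_{ti})\,ds\,d\tau \quad \forall v \in H^{1/2}(I; H^1(\Omega; R^2)) \cap K. \tag{10}$$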



Note that from (7) as well as from the Imbedding Theorem of Sobolev spaces
[1] it follows that $u_0$ and $u_1$ in (6) are continuous on the boundary of the
cylinder $Q$. The existence of solutions to system (1)-(6) was shown in
[6, 7]:

Theorem 2.1 Assume: (i) The data are smooth enough, i.e. (7) is
satisfied, (ii) $\Gamma_2$ is of class [...], (iii) The friction coefficient is small
enough. Then there exists a unique weak solution to the problem (1)-(6).

Proof. The proof is based on penalization of the inequality (10), friction 
regularization and employment of localization and shifting technique due 
to Lions and Magenes. For details of the proof see [7]. 



□ 



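For the sake of brevity we shall consider the contact problem with
prescribed friction, i.e., we shall assume

$$\mathcal{F} |\sigma_N| = 1. \tag{11}$$

The condition (5) is replaced by the following one,

$$u_{tT} \sigma_T + |u_{tT}| = 0, \quad |\sigma_T| \le 1 \ \text{on } I \times \Gamma_2. \tag{12}$$

Let us introduce the space

$$\Lambda = \{\lambda \in L^2(I; L^2(\Gamma_2; R^2)) : |\lambda| \le 1 \ \text{on } I \times \Gamma_2\}. \tag{13}$$

Taking into account (12) the system (10) takes the form: Find $u \in K$
and $\lambda \in \Lambda$ such that

$$\int_Q u_{tti} v_i\,dx\,d\tau + \int_Q \sigma_{ij}(u) e_{ij}(v_i - u_{ti})\,dx\,d\tau - \int_{\gamma_2} \lambda_T (v_T - u_{tT})\,ds\,d\tau$$

$$\ge \int_Q f_i (v_i - u_{ti})\,dx\,d\tau + \int_{\gamma_1} p_i (v_i - u_{ti})\,ds\,d\tau, \tag{14}$$

$$\int_{\gamma_2} \sigma_T u_{tT}\,ds\,d\tau \le \int_{\gamma_2} \lambda_T u_{tT}\,ds\,d\tau \quad \forall \lambda_T \in \Lambda. \tag{15}$$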



3. Formulation of the shape optimization 
problem 

We consider a family $\{\Omega_s\}$ of the domains $\Omega_s$ depending on parameter
$s$. For each $\Omega_s$ we formulate a variational problem corresponding to (10).
In this way we obtain a family of the variational problems depending on
$s$ and for this family we shall study a shape optimization problem,
i.e., we minimize with respect to $s$ a cost functional associated with the
solutions to (10).

The domain $\Omega_s$ we shall consider as an image of a reference domain
$\Omega$ under a smooth mapping $T_s$. To describe the transformation $T_s$ we
shall use the speed method [16]. Let us denote by $V(s, x)$ a sufficiently
regular vector field depending on parameter $s \in [0, \vartheta)$, $\vartheta > 0$:

$$V(\cdot, \cdot) : [0, \vartheta) \times R^2 \to R^2,$$

$$V(s, \cdot) \in C^2(R^2, R^2) \ \forall s \in [0, \vartheta), \quad V(\cdot, x) \in C([0, \vartheta), R^2) \ \forall x \in R^2. \tag{16}$$

Let $T_s(V)$ denote the family of mappings $T_s(V) : R^2 \ni X \to x(s, X) \in R^2$,
where the vector function $x(\cdot, X) = x(\cdot)$ satisfies the system
of ordinary differential equations:

$$\frac{d}{d\tau} x(\tau, X) = V(\tau, x(\tau, X)), \quad \tau \in [0, \vartheta), \quad x(0, X) = X \in R^2. \tag{17}$$

We denote by $DT_s$ the Jacobian of the mapping $T_s(V)$ at a point $X \in R^2$.
We denote by $DT_s^{-1}$ and ${}^*DT_s^{-1}$ the inverse and the transposed
inverse of the Jacobian $DT_s$, respectively. $J_s = \det DT_s$ will denote
the determinant of the Jacobian $DT_s$. The family of domains $\{\Omega_s\}$
depending on parameter $s \in [0, \vartheta)$, $\vartheta > 0$, is defined as follows: $\Omega_0 = \Omega$,

$$\Omega_s = T_s(\Omega)(V) = \{x \in R^2 : \exists X \in \Omega \ \text{s. th. } x = x(s, X),$$

$$\text{where the function } x(\cdot, X) \text{ satisfies (17) for } 0 \le \tau \le s\}. \tag{18}$$
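
The construction (16)-(18) can be illustrated numerically. The following is a minimal sketch, not part of the original paper: it integrates the ODE (17) with a classical Runge-Kutta scheme for a hypothetical field $V(x) = (x_1, -x_2)$, chosen here only because it is divergence free, so the transported domain is expected to keep its volume, as the constraint (23) below requires. All function names are assumptions made for illustration.

```python
import math

def velocity(tau, x):
    """Hypothetical speed field V(tau, x) = (x1, -x2); its divergence is zero."""
    return (x[0], -x[1])

def transport(X, s, steps=200):
    """Approximate T_s(X) = x(s, X) by integrating (17) with the RK4 scheme."""
    h = s / steps
    x, tau = X, 0.0
    for _ in range(steps):
        k1 = velocity(tau, x)
        k2 = velocity(tau + h / 2, (x[0] + h / 2 * k1[0], x[1] + h / 2 * k1[1]))
        k3 = velocity(tau + h / 2, (x[0] + h / 2 * k2[0], x[1] + h / 2 * k2[1]))
        k4 = velocity(tau + h, (x[0] + h * k3[0], x[1] + h * k3[1]))
        x = (x[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
             x[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
        tau += h
    return x

def polygon_area(vertices):
    """Shoelace formula for the area of a simple polygon."""
    area = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        area += x0 * y1 - x1 * y0
    return abs(area) / 2

# Reference domain: a unit square; for a linear field its image is again a
# polygon, so transporting the corners is enough to describe Omega_s.
omega = [(1.0, 1.0), (2.0, 1.0), (2.0, 2.0), (1.0, 2.0)]
s = 0.3
omega_s = [transport(X, s) for X in omega]
```

For a general nonlinear field the image of a polygon is no longer a polygon, so a finer boundary sampling would be needed; the linear field is used only to keep the check exact.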

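Let us consider problem (14)-(15) in the domain $\Omega_s$. Let $F_s$, $K_s$, $\Lambda_s$
be defined, respectively, by (8), (9), (13) with $\Omega_s$ instead of $\Omega$. We shall
write $u_s = u(\Omega_s)$, $\sigma_s = \sigma(u_s)$. The problem (14)-(15) in the domain
$\Omega_s$ takes the form: find $u_s \in K_s$ and $\lambda_s \in \Lambda_s$ such that

$$\int_{Q_s} u_{ttsi} v_i\,dx\,d\tau + \int_{Q_s} \sigma_{ij}(u_s) e_{ij}(v_i - u_{tsi})\,dx\,d\tau - \int_{\gamma_{s2}} \lambda_{sT} (v_T - u_{tsT})\,ds\,d\tau$$

$$\ge \int_{Q_s} f_i (v_i - u_{tsi})\,dx\,d\tau + \int_{\gamma_{s1}} p_i (v_i - u_{tsi})\,ds\,d\tau \quad \forall v \in H^{1/2}(I; H^1(\Omega_s; R^2)) \cap K_s, \tag{19}$$

$$\int_{\gamma_{s2}} \sigma_{sT} u_{tsT}\,ds\,d\tau \le \int_{\gamma_{s2}} \lambda_{sT} u_{tsT}\,ds\,d\tau \quad \forall \lambda_{sT} \in \Lambda_s. \tag{20}$$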



We are ready to formulate the optimization problem. By $\hat\Omega \subset R^2$ we
denote a domain such that $\Omega_s \subset \hat\Omega$ for all $s \in [0, \vartheta)$, $\vartheta > 0$. Let $\phi \in M$
be a given function. The set $M$ is determined by:

$$M = \{\phi \in L^\infty(I; H^2_0(\hat\Omega; R^2)) : \phi \le 0 \ \text{on } I \times \hat\Omega, \ \|\phi\|_{L^\infty(I; H^2_0(\hat\Omega; R^2))} \le 1\}. \tag{21}$$

Let us introduce, for given $\phi \in M$, the following cost functional:

$$J_\phi(u_s) = \int_{\gamma_{s2}} \sigma_{sN} \phi_{tNs}\,ds\,d\tau, \tag{22}$$

where $\phi_{tNs}$ and $\sigma_{sN}$ are normal components of $\phi_{ts}$ and $\sigma_s$, respectively,
depending on parameter $s$. Note that the cost functional (22) approximates
the normal contact stress [3, 8, 11]. We shall consider such a
family of domains $\{\Omega_s\}$ that every $\Omega_s$, $s \in [0, \vartheta)$, $\vartheta > 0$, has constant
volume $c > 0$, i.e., every $\Omega_s$ belongs to the constraint set $U$ given by:

$$U = \{\Omega_s : \int_{\Omega_s} dx = c\}. \tag{23}$$

We shall consider the following shape optimization problem:

For given $\phi \in M$, find the boundary $\Gamma_{2s}$
of the domain $\Omega_s$ occupied by the body, (24)
minimizing the cost functional (22) subject to $\Omega_s \in U$.

The set $U$ given by (23) is assumed to be nonempty. $(u_s, \lambda_s) \in K_s \times \Lambda_s$
satisfy (19)-(20). Note that the goal of the shape optimization problem
(24) is to find such a boundary $\Gamma_2$ of the domain occupied by the body
that the normal contact stress is minimized. Remark that the cost
functional (22) can be written in the following form [3, 17]:

$$\int_{\gamma_{2s}} \sigma_{sN} \phi_{tNs}\,ds\,d\tau = \int_{Q_s} u_{tts} \phi_{ts}\,dx\,d\tau + \int_{Q_s} \sigma_{sij}(u_s) e_{ij}(\phi_{ts})\,dx\,d\tau$$

$$- \int_{Q_s} f \phi_{ts}\,dx\,d\tau - \int_{\gamma_{1s}} p \phi_{ts}\,ds\,d\tau - \int_{\gamma_{2s}} \sigma_{sT} \phi_{tTs}\,ds\,d\tau. \tag{25}$$

We shall assume there exists at least one solution to the optimization
problem (24). It implies a compactness assumption on the set (23) in a
suitable topology. For a detailed discussion concerning the conditions
assuring the existence of optimal solutions see [3, 16].



4. Shape derivatives of contact problem 
solution 

In order to calculate the Euler derivative (44) of the cost functional
(22) we have to determine shape derivatives $(u', \lambda') \in F \times \Lambda$ of a solution
$(u_s, \lambda_s) \in K_s \times \Lambda_s$ of the system (19)-(20). Let us recall from [16]:

Definition 4.1 The shape derivative $u' \in F$ of the function $u_s \in F_s$ is
determined by:

$$(\tilde u_s)|_\Omega = u + s u' + o(s), \tag{26}$$

where $\|o(s)\|_F / s \to 0$ for $s \to 0$, $u = u_0 \in F$, $\tilde u_s \in F(R^2)$ is
an extension of the function $u_s \in F_s$ into the space $F(R^2)$. $F(R^2)$ is
defined by (8) with $R^2$ instead of $\Omega$.

In order to calculate shape derivatives $(u', \lambda') \in F \times \Lambda$ of a solution
$(u_s, \lambda_s) \in K_s \times \Lambda_s$ of the system (19), (20) first we calculate material
derivatives $(\dot u, \dot\lambda) \in F \times \Lambda$ of the solution $(u_s, \lambda_s) \in K_s \times \Lambda_s$ to the
system (19), (20). Let us recall the notion of the material derivative [16]:

Definition 4.2 The material derivative $\dot u \in F$ of the function $u_s \in K_s$
at a point $X \in \Omega$ is determined by:

$$\lim_{s \to 0} \| [(u_s \circ T_s) - u]/s - \dot u \|_F = 0, \tag{27}$$

where $u \in K$, $u_s \circ T_s \in F$ is an image of function $u_s \in K_s$ in the space
$F$ under the mapping $T_s$.

Taking into account Definition 4.2 we can calculate material derivatives
of a solution to the system (19), (20):

Lemma 4.1 The material derivatives $(\dot u, \dot\lambda) \in K_1 \times \Lambda$ of a solution
$(u_s, \lambda_s) \in K_s \times \Lambda_s$ to the system (19)-(20) are determined as a unique
solution to the following system:

f {{uttV + uttV + uttrjdivV (0)(DF + utt{DV {0 )t]) 

JQ 

-fn - fV + icrij{u)ekiirj) - fi])divV {0)}dxdT - ( 28 ) 

/ {PV +PP + ppD)dxdr - / {(Ar^tr + >^PtT + 

7 71 J72 

Wr]TV{Q)n + Xr]tTD}dxdT >0 V77 G Ki, 

/ {X- ji)utT-\-{X- [i)utT + {X- ^l)utT + ^utTD}dxdT V/i G Li, ( 29 ) 

J72 

where $V(0) = V(0, X)$ and $DV(0)$ denotes the Jacobian matrix of the vector field
$V(0)$. Moreover:

K\ = {^ E F : ^ — u — DVu on 70 ; > nDV{0)u on Ai, 

= nDV{0)u on A 2 }, (30) 



Af) = {x ^^2 : utN == 0 }, Ai = {x e B \ (tn ^ 0 }, 

A 2 = {x e B : gn < 0}, (31) 



50 == {(T G 72 : Xt = 1, utT 7^ 0}, 

51 = {a; G 72 : At == -1, utr — 0}, (32) 

S 2 = {a; G 72 : At == 1, : uit = 0}. 

Li = G A : ^ > 0 072 B 2 , ^ < 0 on Si; ^ = 0 072 So } (33) 

and $D$ is given by

$$D = \operatorname{div} V(0) - (DV(0) n, n). \tag{34}$$



Proof: It is based on the approach proposed in [16]. First we transport
the system (19)-(20) to the fixed domain $\Omega$. Let $u^s = u_s \circ T_s \in F$,
$u = u_0 \in F$, $\lambda^s = \lambda_s \circ T_s \in \Lambda$, $\lambda = \lambda_0 \in \Lambda$. Since in general
$u^s \notin K(\Omega)$ we introduce a new variable $z^s = DT_s^{-1} u^s \in K$. Moreover
$\dot z = \dot u - DV(0) u$ [7, 15]. Using this new variable as well as the
formulae for transformation of the function and its gradient into the
reference domain $\Omega$ [15, 16] we write the system (19)-(20) in the reference
domain $\Omega$. Using the estimates on the time derivative of function $u$ [7] the
Lipschitz continuity of $u$ and $\lambda$ satisfying (19)-(20) with respect to $s$
can be proved. Applying to this system the result concerning the
differentiability of solutions to variational inequalities [15, 16] we obtain that
the material derivative $(\dot u, \dot\lambda) \in K_1 \times \Lambda$ satisfies the system (28)-(29).
Moreover from the ellipticity condition of the elasticity coefficients by
a standard argument [15] it follows that $(\dot u, \dot\lambda) \in K_1 \times \Lambda$ is a unique
solution to the system (28)-(29).

□ 

Recall [16] that if the shape derivative $u' \in F$ of the function $u_s \in F_s$
exists, then the following condition holds:

$$u' = \dot u - \nabla u V(0), \tag{35}$$

where $\dot u \in F$ is the material derivative of the function $u_s \in F_s$.
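
Relation (35) can be checked on a one-dimensional toy example. The sketch below is an illustration only and is not taken from the paper: the scaling map $T_s(X) = (1 + s)X$, for which $V(0)(X) = X$, and the family $u_s(x) = (1 + s)\sin(\pi x/(1 + s))$ are hypothetical choices, and the derivatives in Definitions 4.1 and 4.2 are replaced by difference quotients.

```python
import math

def T(s, X):
    """Hypothetical 1D transformation T_s(X) = (1 + s) X, so V(0)(X) = X."""
    return (1.0 + s) * X

def u_s(s, x):
    """Hypothetical family of functions u_s on Omega_s = (0, 1 + s)."""
    return (1.0 + s) * math.sin(math.pi * x / (1.0 + s))

def material_derivative(X, ds=1e-6):
    """Difference quotient from (27): [(u_s o T_s)(X) - u(X)] / s."""
    return (u_s(ds, T(ds, X)) - u_s(0.0, X)) / ds

def shape_derivative(x, ds=1e-6):
    """Difference quotient from (26) at a fixed point x of Omega."""
    return (u_s(ds, x) - u_s(0.0, x)) / ds

def gradient_u(x, dx=1e-6):
    """Central difference approximation of the spatial derivative of u = u_0."""
    return (u_s(0.0, x + dx) - u_s(0.0, x - dx)) / (2 * dx)

def relation_gap(x):
    """Residual of (35): u' - (material derivative - grad u * V(0))."""
    V0 = x  # V(0, x) = x for the scaling map above
    return shape_derivative(x) - (material_derivative(x) - gradient_u(x) * V0)
```

For this family the material derivative is $\sin(\pi X)$ and the shape derivative is $\sin(\pi x) - \pi x \cos(\pi x)$, so `relation_gap` should vanish up to the finite difference error.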

From the regularity result in [7] it follows that:

$$\nabla u V(0) \in F, \quad \nabla \lambda V(0) \in \Lambda, \tag{36}$$

where the spaces $F$ and $\Lambda$ are determined by (8) and (13) respectively.

Integrating by parts system (28), (29) and taking into account (35), (36)
we obtain a system similar to (28), (29) determining the shape derivative
$(u', \lambda') \in F \times \Lambda$ of the solution $(u_s, \lambda_s) \in K_s \times \Lambda_s$ of the system
(19)-(20):
f u[^r] + uttr}' + {DV (0) +* DV {Q))uur}dxdT + f uurjV (0)n 
Jq Jj 

/ aij{u')eklT] - / X'rjtT + Xr][j^}dxdT + 

Jq J72 

h{ut,v) + h{X,u,rj) >0 V 77 € iVi, (37) 



— A) — utrX'jdxdT + 73 ( 14 , /i — A) > 0 G Li, (38) 
Ni = {r]eF : t] = X - DuV{0), X e Ki}, (39) 



hi^P, (f>)= Wij{p)ekl^ - f(i>- 

J-y 

{{Vpn)(j) + {pV(j))n + p(j)H)V {0)n}dxdT, 
hip,P,(f>)=[ {{Vp)nV(f) + p{'V{V(pn))(p-\- 

'>'72 

liS/ipirH + jiV ipn}V {Qi)ndxdr^ 

Is{(p, /i - A) — / {(pn){fj, - A) + (p{V/j.n) - (p{VXn) -f 
J 72 

(p{li — X)H]V{ff)ndxdr^ 

where $H$ denotes the mean curvature of the boundary $\Gamma$ [16].

(40)

(41)

(42)

5. Necessary optimality condition

Our goal is to calculate the directional derivative of the cost functional
(22) with respect to the parameter $s$. We will use this derivative to
formulate a necessary optimality condition for the optimization problem
(24). First, let us recall from [16] the notion of the Euler derivative of a
cost functional depending on the domain:

Definition 5.1 The Euler derivative $dJ(\Omega; V)$ of the cost functional $J$ at
a point $\Omega$ in the direction of the vector field $V$ is given by:

$$dJ(\Omega; V) = \limsup_{s \to 0} [J(\Omega_s) - J(\Omega)]/s. \tag{43}$$
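
Definition 5.1 can be tried out on the simplest domain functional, the volume appearing in the constraint (23). The following sketch is an illustration under stated assumptions and not the paper's computation: $J(\Omega)$ is the area of a polygon, the field is the hypothetical dilation $V(x) = x$, for which the flow of (17) is $T_s(X) = e^s X$, and the difference quotient in (43) is compared with the boundary integral of $V(0) n$, the quantity that multiplies the Lagrange multiplier in the optimality condition of Theorem 5.1.

```python
import math

def polygon_area(vertices):
    """Shoelace formula for the area of a simple polygon."""
    area = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        area += x0 * y1 - x1 * y0
    return abs(area) / 2

def transported(vertices, s):
    """T_s(X) = exp(s) X solves (17) for the hypothetical field V(x) = x."""
    return [(math.exp(s) * x, math.exp(s) * y) for x, y in vertices]

def euler_quotient(vertices, s):
    """Difference quotient [J(Omega_s) - J(Omega)] / s from (43), J = area."""
    return (polygon_area(transported(vertices, s)) - polygon_area(vertices)) / s

def boundary_flux(vertices):
    """Integral of V(0).n over the boundary of a counter-clockwise polygon."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        # Outward normal scaled by the edge length (counter-clockwise order).
        nx, ny = (y1 - y0), -(x1 - x0)
        # For V(x) = x the product V.n is constant along a straight edge,
        # so evaluating it at the edge midpoint integrates it exactly.
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        total += mx * nx + my * ny
    return total

square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
```

For the unit square both `euler_quotient(square, s)` with small `s` and `boundary_flux(square)` are close to 2, the integral of $\operatorname{div} V$ over the square.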



The form of the directional derivative $dJ_\phi(u; V)$ of the cost functional
(22) is given in:

Lemma 5.1 The directional derivative $dJ_\phi(u; V)$ of the cost functional
(22), for $\phi \in M$ given, at a point $u \in K$ in the direction of the vector field
$V$ is determined by:



dJ^{u- V)= [ [u'ttTj + Uttrj’ + {DV{0) +* DV{0))uttv]dxdr 
JQ 

J uttvV{0)n + J^{aijeki{(f>)da 

- f<P)V{Q)nds - f ivp<j>V{0) 

p\/ (l)V{0) +p(l)D)ds - [ a!j^(l)Tds + Ii{u,(l)) - l 2 {X,u,(j)), (44) 

JVo 



+ 
ix + 
+ 



where $\sigma'$ is a shape derivative of the function $\sigma_s$ with respect to $s$. This
derivative is defined by (26). $\nabla p$ is the gradient of function $p$ with respect
to $x$. Moreover $V(0) = V(0, X)$, $\phi_T$ and $\sigma_T$ are tangent components of
functions $\phi$ and $\sigma$, respectively, and $D$ is given by (34). $DV(0)$
denotes the Jacobian matrix of the vector field $V(0)$ and div denotes the divergence
operator.



Proof: Taking into account (22), (25) as well as the formulae for
transformation of the gradient of a function defined on the domain $\Omega_s$ into the
reference domain $\Omega$ [16] and using the mapping (16)-(17) we can
express the cost functional (22) defined on the domain $\Omega_s$ in the form of the
functional $J_\phi(u^s)$ defined on the domain $\Omega$, determined by:



- [ {DT 
Jq 



jU 



ttDT sffldetDT gdxdr + 
DTseki{DTs<i>l) - rDTs4>)^^WTsdx 




II detZ?Ts*Z)Tj || ds — 
\stDT! g(j)'p II detZ^T^^ZJTj || ds, 



(45) 



where $u^s = u_s \circ T_s \in F$, $u = u_0 \in F$ and $\lambda = \lambda_0 \in \Lambda$. By (43) we have:

$$dJ_\phi(u; V) = \limsup_{s \to 0} [J_\phi(u^s) - J_\phi(u)]/s. \tag{46}$$

Remark that it follows by standard arguments [3] that the pair
$(u_s, \lambda_s) \in K_s \times \Lambda_s$, $s \in [0, \vartheta)$, $\vartheta > 0$, satisfying the system (19)-(20) is Lipschitz
continuous with respect to the parameter $s$. Passing to the limit with
$s \to 0$ in (46) as well as taking into account the formulae for derivatives
of $DT_s^{-1}$ and $\det DT_s$ with respect to the parameter $s$ [16] and (26) we
obtain (44).



□ 



In order to eliminate the shape derivative $(u', \lambda')$ from (44) we
introduce an adjoint state $(r, q) \in K_2 \times L_2$ defined as follows:



with 



/ rttCdxdr+ / (Jij{C)eki{(l) + r)dxdr + 
JQ Jq 

[ Ctriq - XKdxdr = 0 e K 2 , 

r(T,x) = 0, rt{T,x) == 0, 

/ {nr + ^tT - utT)Sdxdr = 0 \/S e L 2 , 

K 2 = {c e Ki : (n = 0 on Aq}, 



(47) 



(48) 

(49) 



L 2 = {S e A : 6 = 0 on ^0 n 5o }. (50) 

Since $\phi \in M$ is a given element, then by the same arguments as used to
show the existence of a solution $(u, \lambda) \in K \times \Lambda$ to the system (19)-(20)
we can show the existence of the solution $(r, q) \in K_2 \times L_2$ to the system
(47), (48).

From (44), (37), (38), (47), (48) we obtain:

$$dJ_\phi(u; V) = I_1(u, \phi + r) + I_2(\lambda, u, \phi + r) + I_3(u, q - \lambda). \tag{51}$$

The necessary optimality condition has a standard form:

Theorem 5.1 There exists a Lagrange multiplier $\mu \in R$ such that for
all vector fields $V$ determined by (16), (17) the following condition holds

$$dJ_\phi(u; V) + \mu \int_\Gamma V(0) n\,ds \ge 0, \tag{52}$$

where $dJ_\phi(u; V)$ is given by (51).

Proof : It is given in [3, 4, 5, 16, 17]. 

6. Conclusions 

In the paper the necessary optimality condition for the shape opti- 
mization problem for the dynamical contact problem was formulated. 
Preliminary numerical results can be found in [13] where the continuous 
optimization problem was discretized by piecewise linear and piecewise 
constant functions on each finite element. The discretized problem was 
numerically solved by an Augmented Lagrangian Algorithm combined 
with active set strategy and updating of the dual variables.

Abstract In this paper, we consider two approaches to solving an optimization
based design problem where “shape” is the design parameter. Both
methods use domain transformations to compute gradients. However,
they differ in that the second method is based on solving a transformed
optimization problem completely in the computational domain. We
illustrate the methods using a simple 1D problem and discuss the benefits
and drawbacks of each approach.

Keywords: Continuous Sensitivity Equation Methods, Optimal Design 

1. Introduction 

The focus of the paper is an optimal design problem where the de- 
sign parameter determines the shape of the domain of the constraint 
equation. The cost function is given in terms of an integral expression 
describing the L2 difference between some target function and the state 
variable. The constraint equation, or state equation, takes the form of 
an elliptic partial differential equation defined on a parameter dependent 
domain. Under the assumption that each point in the design space deter- 
mines a unique state variable through the solution of the state equation, 
we pose the unconstrained optimal design problem.

Since the domain of the constraint equation changes with perturba-
tions in the design, numerical solution of the optimal design problem
is often hampered by burdensome grid generation requirements at each 
iteration of an optimization algorithm. One technique that can be used 
to avoid this problem is to transform the domain of the constraint equa- 
tion to one that is fixed and no longer depends on the shape param- 
eter. An equivalent transformed constraint equation is posed on this 
fixed, computational domain, see [4, 8], for example. In this paper, we 
present two approaches to the optimal design problem. Each approach 
uses the transformation technique mentioned above along with CSEMs 
(Continuous Sensitivity Equation Methods) in order to solve the op- 
timal design problem. The main difference between the two methods 
is that one solves the optimal design problem using the parameter de- 
pendent domain of the constraint equation while the second approach 
applies a mapping technique in order to transform both the cost func- 
tion and the constraint equation to a fixed computational space. This 
results in a transformed optimization problem. In each case, gradient 
based optimization is applied, and CSEMs are used to supply gradient 
information. 

One of the major topics of concern for using CSEMs with optimal 
design is the issue of consistent derivatives. Within the optimization 
literature, the assumption is usually made that the gradient information 
is the derivative (with respect to the design parameter) of the numerical 
approximation of the cost function. There is a great deal of concern 
that convergence and robustness are compromised if the derivative ap- 
proximations are computed using techniques which do not account for 
truncation and roundoff errors implicitly contained in the cost function. 
In [1,2], the notion of asymptotically consistent derivatives is introduced, 
and CSEMs, when coupled with a trust region method, are shown to be 
applicable within optimal design algorithms. More precise definitions are 
introduced in Section 5.1. We first pose an example optimal design prob- 
lem, and the computational approaches mentioned above are sketched 
out in the context of this example. Numerical results are shown, and 
we conclude with some general remarks concerning these approaches in 
Section 6. 

2. A 1D Optimal Design Problem

Let $Q = [1, +\infty)$ denote the design space, and for $q \in Q$, let $\Omega_q = (0, q)$.
Consider the boundary value problem

$$-\frac{d^2}{dx^2} w(x; q) = f(x), \quad x \in \Omega_q, \tag{1}$$

with homogeneous Dirichlet boundary conditions

$$w(0) = 0, \quad w(q) = 0. \tag{2}$$


The forcing function, $f : (0, +\infty) \to \mathbb{R}$, is the piecewise continuous
function defined by

$$f(x) = \begin{cases} 0, & 0 < x < 1, \\ -1, & 1 < x < +\infty. \end{cases} \tag{3}$$



For each $q \in Q$, (1)-(2) has a unique solution $w(\cdot\,; q) \in H^2(0, q) \cap H_0^1(0, q)$.
Thus, we define a cost function $F : Q \to \mathbb{R}$ by

$$F(q) = \frac{1}{2} \int_0^1 [w(x; q) + \sin(\pi x)]^2\,dx, \tag{4}$$

and we focus on the optimal design problem:

$$\min_{q \in Q} F(q). \tag{5}$$

Observe that the state equation, (1)-(2), is defined on the “physical”
space $\Omega_q$ and the cost function, $F(\cdot)$, is defined over a fixed subset of
this space. For this simple example, $q$ can be interpreted as a “shape”
parameter in the sense that it determines the length of the interval over
which the state $w(\cdot\,; q)$ is defined.
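
For this model problem the state can be written down by hand, which is convenient for checking numerical schemes. The sketch below is an illustration and not code from the paper: it assumes the form of (1)-(4) as stated above, uses the closed form $w(x; q) = a(q) x + \tfrac{1}{2} \max(x - 1, 0)^2$ with $a(q) = -(q - 1)^2/(2q)$, which follows from integrating (1) twice and imposing (2), and evaluates the cost with a composite midpoint rule. The function names are hypothetical.

```python
import math

def slope(q):
    """Slope a(q) of the exact state on (0, 1), where w(x; q) = a(q) x."""
    return -(q - 1.0) ** 2 / (2.0 * q)

def state(x, q):
    """Exact solution of (1)-(2): linear on (0, 1), quadratic on (1, q)."""
    return slope(q) * x + 0.5 * max(x - 1.0, 0.0) ** 2

def cost(q, n=2000):
    """Cost (4) approximated by the composite midpoint rule on (0, 1)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += (state(x, q) + math.sin(math.pi * x)) ** 2
    return 0.5 * h * total

def cost_closed_form(q):
    """The same integral evaluated by hand for w(x; q) = a(q) x on (0, 1)."""
    a = slope(q)
    return 0.5 * (a * a / 3.0 + 2.0 * a / math.pi + 0.5)
```

Since the cost only sees the state on $(0, 1)$, where it is linear, the design parameter enters $F$ solely through the slope $a(q)$.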



2.1 Domain Transformations 

For large scale problems where the shape of the domain of the state 
equation is parameter dependent, grid generation often poses a major 
difficulty in the optimal design process. As mentioned earlier, one way 
to overcome this obstacle is to apply a domain transformation from the 
physical space to the fixed, computational space. For the model problem 
discussed in this paper, transforming is clearly very simple. We note that 
determining the domain transformation for any given two-dimensional 
or three-dimensional set can be much more complicated. Moreover, this 
calculation often requires the application of a numerical method. In 
order to focus on the issues related to sensitivity computation and the 
resulting gradient approximations, the application of an algebraic do- 
main mapping to the model problem is justified. 

Here we describe the transformation of the parameter dependent domain
$[0, q]$ to the fixed computational domain, $[0, 1]$. Once this mapping
is constructed, the transformed state equation is defined accordingly. For
$\alpha > 0$, let $\Omega_\alpha = (0, \alpha)$, and for each fixed $q > 1$, define the transformation
$M(\cdot\,; q) : \Omega_1 \to \Omega_q$ by

$$M(\xi; q) = q\xi = x. \tag{6}$$

Note that the spatial variable on the fixed domain $\Omega_1$ is $\xi$, and we use $x$
to denote the spatial variable on $\Omega_q$. The transformations given above
are used to define the “transformed” functions. Let $\xi \in \Omega_1$ and $q > 1$,
and for any function $u \in H_0^1(\Omega_q)$, define the transformed function
$\hat u \in H_0^1(\Omega_1)$ as follows

$$\hat u(\xi; q) = u(M(\xi; q); q) = u(x; q). \tag{7}$$

It can be shown that for a given value of $q$, if $w(\cdot\,; q)$ is a solution
to the boundary value problem given in (1)-(2), then the corresponding
function $\hat w(\cdot\,; q)$ is a solution to the boundary value problem

$$-\hat w''(\xi) = q^2 \hat f(\xi; q), \quad \xi \in (0, 1), \tag{8}$$

with boundary conditions

$$\hat w(0) = 0, \quad \hat w(1) = 0. \tag{9}$$

The forcing function $\hat f(\xi; q)$ is obtained by using the mapping $M$ and the
relation $\hat f(\xi; q) = f(M(\xi, q), q) = f(x)$ and has the form

$$\hat f(\xi; q) = \begin{cases} 0, & 0 < \xi < 1/q, \\ -1, & 1/q < \xi < 1. \end{cases} \tag{10}$$

Henceforth, the boundary value problem (8)-(9) is referred to as the
transformed state equation, and it is used in each of the computational
approaches described in the following sections.
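
A transformed state equation of this kind is easy to discretize on a fixed mesh. The sketch below is an assumption-laden illustration rather than the authors' implementation: it takes the transformed problem in the form $-\hat w'' = q^2 \hat f$ on $(0, 1)$ with homogeneous Dirichlet data and $\hat f$ equal to 0 left of $1/q$ and $-1$ right of it, uses piecewise linear finite elements on a uniform mesh, and assumes that $1/q$ is a mesh node so that the one-point load integration is exact. The helper names are hypothetical.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[0] and upper[-1] are not used."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        m = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / m if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / m
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def forcing_hat(xi, q):
    """Transformed forcing: 0 left of 1/q and -1 right of it."""
    return 0.0 if xi < 1.0 / q else -1.0

def fem_state(q, n_elem):
    """Linear FEM for -w'' = q^2 f_hat on (0, 1) with w(0) = w(1) = 0.

    Returns the values at the interior nodes j * h, j = 1, ..., n_elem - 1.
    """
    h = 1.0 / n_elem
    n = n_elem - 1
    rhs = [0.0] * n
    for e in range(n_elem):
        # f_hat is constant on each element when 1/q is a node, so the
        # element load is f * h / 2 for each of the two hat functions.
        load = q * q * forcing_hat((e + 0.5) * h, q) * h / 2.0
        if e >= 1:
            rhs[e - 1] += load   # left node of element e
        if e <= n_elem - 2:
            rhs[e] += load       # right node of element e
    return solve_tridiagonal([-1.0 / h] * n, [2.0 / h] * n, [-1.0 / h] * n, rhs)

def exact_state_hat(xi, q):
    """Closed form w_hat(xi; q) = w(q xi; q) of the model problem."""
    x = q * xi
    return -(q - 1.0) ** 2 / (2.0 * q) * x + 0.5 * max(x - 1.0, 0.0) ** 2
```

With exact load integration, linear elements for this one-dimensional problem reproduce the exact solution at the nodes, so `fem_state(2.0, 10)` should agree with `exact_state_hat` at the nodes up to rounding.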



3. Computational Approach 1 

In this section, we describe one approach for solving the optimal design
problem in (5). This approach can be described as a “differentiate-then-map”
scheme. Observe that the gradient of the cost function has the
form

$$\nabla F(q) = \frac{d}{dq} F(q) = \int_0^1 [w(x; q) + \sin(\pi x)]\, s(x; q)\,dx, \tag{11}$$

where the sensitivity is defined as follows

$$s(\cdot\,; q) = \frac{\partial}{\partial q} w(\cdot\,; q). \tag{12}$$

In order to compute the sensitivity, we use the CSEM approach. We
derive a sensitivity equation, an equation for which the sensitivity in
(12) is a solution. Formally speaking, this equation is derived by implicit
“differentiation” of the state equation and boundary conditions in (1)-(2).
For the model problem considered here, it can be shown that the
sensitivity equation and associated boundary conditions are given by

$$-\frac{d^2}{dx^2} s(x) = 0, \quad x \in \Omega_q, \tag{13}$$

with boundary conditions

$$s(0) = 0, \quad s(q) = -\frac{\partial}{\partial x} w(q). \tag{14}$$

Observe that the normal derivative of $w$ appears in the right boundary
condition in (14). This is typical for shape sensitivity problems,
and these boundary conditions are tricky to derive correctly for more
complicated problems.
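
The sensitivity equation can be checked against difference quotients on the model problem. The sketch below is illustrative only and rests on assumptions: the closed form of the state, $w(x; q) = a(q) x + \tfrac{1}{2}\max(x - 1, 0)^2$ with $a(q) = -(q - 1)^2/(2q)$, the sensitivity taken as the linear function with $s(0) = 0$ and $s(q) = -\partial w/\partial x$ at $x = q$, and a midpoint rule for the gradient integral over $(0, 1)$. The function names are hypothetical.

```python
import math

def slope(q):
    """Slope a(q) of the exact state on (0, 1)."""
    return -(q - 1.0) ** 2 / (2.0 * q)

def state(x, q):
    """Exact state of the model problem."""
    return slope(q) * x + 0.5 * max(x - 1.0, 0.0) ** 2

def state_dx_at_boundary(q):
    """Spatial derivative of the exact state at the moving boundary x = q."""
    return slope(q) + (q - 1.0)

def sensitivity(x, q):
    """Linear solution of -s'' = 0 with s(0) = 0 and s(q) = -w_x(q)."""
    return -state_dx_at_boundary(q) * x / q

def cost(q, n=2000):
    """Cost 0.5 * int_0^1 [w + sin(pi x)]^2 dx by the midpoint rule."""
    h = 1.0 / n
    return 0.5 * h * sum(
        (state((i + 0.5) * h, q) + math.sin(math.pi * (i + 0.5) * h)) ** 2
        for i in range(n))

def gradient_csem(q, n=2000):
    """Gradient int_0^1 [w + sin(pi x)] s dx with the same midpoint rule."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += (state(x, q) + math.sin(math.pi * x)) * sensitivity(x, q)
    return h * total

def central_difference(func, q, eps=1e-5):
    """Central difference quotient of a scalar function of q."""
    return (func(q + eps) - func(q - eps)) / (2.0 * eps)
```

A usage check: `central_difference(cost, 2.5)` and `gradient_csem(2.5)` should agree to several digits, because the same quadrature rule is applied to the cost and to its gradient.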

Gradient based optimization requires that we numerically approxi- 
mate both the cost function and its gradient for a given value of the 
parameter q. Aside from the implementation of a quadrature rule, each 
iteration of the optimization algorithm involves a numerical calculation 
of both the state and the sensitivity for a given design parameter value. 
The following section describes the numerical scheme employed for these 
computations. 

3.1 State and Sensitivity Calculations 

Here we illustrate the use of the mapping technique discussed in
Section 2.1. Both a transformed state equation and a transformed
sensitivity equation are constructed on the computational domain $\Omega_1$. The
derivation of the transformed state equation is presented in detail in
Section 2.1 and is given explicitly in equations (8)-(9). In a similar fashion,
we define the transformed sensitivity

$$\hat s(\xi; q) = s(M(\xi, q); q) = s(x; q), \tag{15}$$

and the transformed sensitivity equation is constructed. This boundary
value problem has the form

$$-\hat s''(\xi) = 0, \quad \xi \in (0, 1), \tag{16}$$

with boundary conditions

$$\hat s(0) = 0, \quad \hat s(1) = -\frac{\partial}{\partial x} w(q) = -\left(\frac{1}{q}\right) \cdot \frac{\partial}{\partial \xi} \hat w(1). \tag{17}$$

Once the transformed equations are constructed, a discretization is
applied. For the numerical approximations presented here, we apply a
piecewise linear finite element method to (8)-(9) and to (16)-(17). For
the sake of brevity, the details of the implementation are omitted;
however, the interested reader can refer to [3] for a more complete
exposition. Once the numerical calculations are performed, the recovery of the
state and sensitivity approximations (defined on the physical space $\Omega_q$)
is achieved through the relations in (7) and (15).

4. Computational Approach 2 

In this section, we present an approach to the optimal design problem 
which is similar to an idea considered in [5]. Like the previous scheme, 
both domain transformations and CSEMs are used in this strategy. The 
fundamental difference between the following approach and the one pre- 
sented in Section 3 is the order in which these techniques are applied. In 
this section, the domain transformation is applied to the cost function 
as well as the state equation. First, we construct a transformed optimal 
design problem which is equivalent to the original in (4) and which uses 
information from the transformed state equation. A CSEM is then used 
to supply gradient information for the transformed cost function. 

Before presenting the transformed optimal design problem, we remark
that under the mapping $M$ defined in (6), the following equality holds

$$\int_0^1 g(x)\,dx = \int_{M^{-1}(0; q)}^{M^{-1}(1; q)} g(M(\xi, q)) \frac{dx}{d\xi}\,d\xi = q \int_0^{1/q} g(M(\xi, q))\,d\xi,$$

where $g$ is any $C^0$-function defined on $\Omega_1$. Along with the previous
equality, the definitions in (6) and (7) give rise to the transformed cost
function

$$\hat F(q) = \frac{q}{2} \int_0^{1/q} [\hat w(\xi, q) + \sin(q \pi \xi)]^2\,d\xi. \tag{18}$$

Here $\hat w(\xi, q)$ is the solution to the transformed state equation given
by the boundary value problem (8)-(9) for each $q \in Q$. Hence, the
transformed optimal design problem is given by

$$\min_{q \in Q} \hat F(q), \tag{19}$$

where the design space, $Q$, remains the same as in Section 2. Observe
that a factor of $q$ appears in the expression (18). Recall that the mapping,
$M$, depends explicitly on the parameter $q$. Hence, the absolute
value of the derivative of the mapping (and more generally, the determinant
of the Jacobian matrix) is also parameter dependent, and this term
appears explicitly in (18). For this particular example, the derivative is
very simple, but we remark that the issue of parameter dependent derivatives
is more complicated for two-dimensional and three-dimensional domains.
A two-dimensional illustration can be found on pages 365-366 of
[4].
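
The equivalence of the physical and the transformed cost can be verified numerically. The following sketch is an illustration under assumptions, not the paper's code: the physical cost is taken as $\tfrac{1}{2}\int_0^1 [w + \sin(\pi x)]^2 dx$, the transformed cost as $\tfrac{q}{2}\int_0^{1/q} [\hat w + \sin(q\pi\xi)]^2 d\xi$, the state is the closed form of the model problem, and both integrals use a midpoint rule with the same number of cells. The function names are hypothetical.

```python
import math

def slope(q):
    """Slope a(q) of the exact state on (0, 1)."""
    return -(q - 1.0) ** 2 / (2.0 * q)

def state(x, q):
    """Exact state w(x; q) on the physical domain (0, q)."""
    return slope(q) * x + 0.5 * max(x - 1.0, 0.0) ** 2

def state_hat(xi, q):
    """Transformed state w_hat(xi; q) = w(q xi; q) on (0, 1)."""
    return state(q * xi, q)

def cost_physical(q, n=4000):
    """Physical cost: midpoint rule on (0, 1) in the variable x."""
    h = 1.0 / n
    return 0.5 * h * sum(
        (state((i + 0.5) * h, q) + math.sin(math.pi * (i + 0.5) * h)) ** 2
        for i in range(n))

def cost_transformed(q, n=4000):
    """Transformed cost: midpoint rule on (0, 1/q) in the variable xi,
    multiplied by the factor q that comes from dx = q d(xi)."""
    h = (1.0 / q) / n
    return 0.5 * q * h * sum(
        (state_hat((i + 0.5) * h, q) + math.sin(q * math.pi * (i + 0.5) * h)) ** 2
        for i in range(n))
```

Because the quadrature points of the two rules are images of each other under $x = q\xi$, the two values coincide up to rounding, which is the discrete counterpart of the change of variables.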

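Once the transformed optimal design problem is constructed, we proceed
in much the same manner as previously discussed. Using Leibniz's
formula, the gradient of the transformed cost function has the following
form

$$\frac{d}{dq} \hat F(q) = \frac{1}{2} \int_0^{1/q} [\hat w(\xi; q) + \sin(q \pi \xi)]^2\,d\xi + \left(\frac{q}{2}\right) [\hat w(\tfrac{1}{q}; q)]^2 \left(-\frac{1}{q^2}\right)$$

$$+ \left(\frac{q}{2}\right) \int_0^{1/q} 2 [\hat w(\xi; q) + \sin(q \pi \xi)] [p(\xi; q) + \pi \xi \cos(q \pi \xi)]\,d\xi,$$

which can be simplified to the expression

$$\frac{d}{dq} \hat F(q) = q \int_0^{1/q} [\hat w(\xi, q) + \sin(q \pi \xi)] [p(\xi, q) + \pi \xi \cos(q \pi \xi)]\,d\xi$$

$$+ \frac{1}{2q} \left( 2 \hat F(q) - [\hat w(\tfrac{1}{q}; q)]^2 \right). \tag{20}$$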

In the equation above, the notation $p(\xi; q)$ is used to denote the
sensitivity of the transformed state; that is, we define

$$p(\xi; q) = \frac{\partial}{\partial q} \hat w(\xi; q). \tag{21}$$

It is important to note that the sensitivity of the transformed state,
$p(\xi; q)$, is related to, but not the same function as, the transformed
sensitivity, $\hat s(\xi; q)$. The notation used above reflects this important
distinction. The following section describes the techniques used for obtaining
numerical approximations for the transformed state and the sensitivity,
$p(\xi; q)$.
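
The gradient of the transformed cost can be checked on the model problem. The sketch below is illustrative and rests on assumptions: the gradient is evaluated as $q \int_0^{1/q} [\hat w + \sin(q\pi\xi)][p + \pi\xi\cos(q\pi\xi)]\,d\xi + \tfrac{1}{2q}(2\hat F(q) - [\hat w(1/q; q)]^2)$, the transformed state and its sensitivity $p$ are differentiated by hand from the closed form of the model problem, and the integrals use a midpoint rule. The function names are hypothetical.

```python
import math

def slope(q):
    """Slope a(q) of the exact state on (0, 1)."""
    return -(q - 1.0) ** 2 / (2.0 * q)

def slope_dq(q):
    """Derivative a'(q) = -(q^2 - 1) / (2 q^2)."""
    return -(q * q - 1.0) / (2.0 * q * q)

def state_hat(xi, q):
    """Transformed state w_hat(xi; q) = w(q xi; q)."""
    x = q * xi
    return slope(q) * x + 0.5 * max(x - 1.0, 0.0) ** 2

def sens_of_state_hat(xi, q):
    """p(xi; q) = d/dq w_hat(xi; q), differentiated by hand."""
    x = q * xi
    return slope_dq(q) * x + slope(q) * xi + max(x - 1.0, 0.0) * xi

def cost_transformed(q, n=4000):
    """Transformed cost (q/2) int_0^{1/q} [w_hat + sin(q pi xi)]^2 d(xi)."""
    h = (1.0 / q) / n
    return 0.5 * q * h * sum(
        (state_hat((i + 0.5) * h, q) + math.sin(q * math.pi * (i + 0.5) * h)) ** 2
        for i in range(n))

def gradient_transformed(q, n=4000):
    """Simplified gradient expression with midpoint quadrature on (0, 1/q)."""
    h = (1.0 / q) / n
    integral = 0.0
    for i in range(n):
        xi = (i + 0.5) * h
        residual = state_hat(xi, q) + math.sin(q * math.pi * xi)
        d_residual = sens_of_state_hat(xi, q) + math.pi * xi * math.cos(q * math.pi * xi)
        integral += residual * d_residual
    integral *= h
    boundary = state_hat(1.0 / q, q) ** 2
    return q * integral + (2.0 * cost_transformed(q, n) - boundary) / (2.0 * q)

def gradient_closed_form(q):
    """Derivative of 0.5 * (a^2 / 3 + 2 a / pi + 1 / 2) with respect to q."""
    return slope_dq(q) * (slope(q) / 3.0 + 1.0 / math.pi)
```

The closed form used for comparison is the derivative of the physical cost, so agreement also illustrates that both computational approaches target the same gradient.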



4.1 state and Sensitivity Calculations 

For this approach, the optimization algorithm requires that we com- 
pute a numerical approximation to the transformed state, w(-;g^), and 
the sensitivity of the transformed state, p(-;^)- As in the previous sec- 
tion, a piecewise linear finite element method is used to approximate the 
transformed state w(^,g). 

In order to calculate an approximation for $p(\cdot; q)$, we derive a
sensitivity equation for which $p(\cdot; q)$ is a solution. In
particular, the transformed state equation in (8)-(9) is
“differentiated”. Although the parameter appears explicitly in the right
hand side of equation (8) and determines a point of discontinuity of the
forcing function, one can still derive the sensitivity equation in a
mathematically precise fashion. A rigorous mathematical construction is
presented in [6] and references therein. Here we simply state that the
sensitivity, $p(\cdot; q)$, satisfies the second order, linear elliptic
boundary value problem given by

- (5i(C), (22) 

p(0) = 0, p(1) = 0. (23)

Here 5i(£) is the Dirac delta function with mass at ^ i. Since the 

domain does not depend on $q$, the boundary conditions are clear.
Observe that the sensitivity equation is decoupled from the transformed
state equation, but we caution the reader that this decoupling is merely
a consequence of the linearity of the transformed state equation. We also
note that the linear elliptic problem (22)- (23) does not have a solution in 
i^o (^ 1)7 the system must be interpreted in the weak sense, that is, in 
integral form. For the results presented in this paper, a piecewise
linear finite element method is used to approximate both $w(\cdot; q)$
and $p(\cdot; q)$. For the sake of brevity, the details of the finite
element implementations are omitted, and we proceed directly to the
computational results.

5. Computational Results 

In this section, numerical results are presented for two cases. The first 
is a comparison using a four-point Gauss quadrature rule for both the 
cost function approximations and the gradient approximations. From 
the second we make an interesting anecdotal comment concerning the 
importance of choosing a quadrature rule with the appropriate degree 
of accuracy. 

Recall that each computational approach involves discretizing and
numerically computing an approximation to the transformed state equation
(8)-(9). The distinction between the calculations is that Computational
Approach 1 recovers an approximation to the original state through the
mapping, M, and implements the quadrature rule in the physical space,
while Computational Approach 2 applies the quadrature rule in the
computational space. Since M is a straightforward algebraic manipulation
which can be “hard-wired”, there is no loss in accuracy for the state
approximation during the recovery process of Computational Approach 1.
We briefly note that a four-point quadrature rule is sufficient to
obtain an extremely accurate approximation to the true cost function in
each case. Figures 1 and 2 show the respective cost function
approximations plotted against the graph of the true cost function. The
step in the parameter is $\Delta q = 0.1$ over the parameter range
given, and the transformed state approximations are obtained using
N = 3 grid points for these graphs. We also note that the error
(measured in the vector norm, $\|\cdot\|_\infty$) in the cost function
approximations is on the order
of 10“^ for each of the computational algorithms. Now we move to the 
more interesting issue of gradient approximations. 



Figure 1. True Cost Function and Approximations for Computational Approach 1 




5.1 Gradient Approximations 

This section briefly addresses the issue of gradient approximations for
each computational approach. We preface the numerical results with two
definitions regarding gradient approximations. The following discussion
and definitions are taken from [2, 1]. We remark that the notation in [1]
is slightly different because the authors explore the issue of applying
different approximation schemes to obtain the state and sensitivity
approximations. For our results, the discretization applied to compute
the state approximations, and subsequently the cost function
approximations, is exactly the same as that applied to compute the
sensitivity approximations and the subsequent gradient approximations.
Figure 2. True Cost Function and Approximations for Computational Approach 2




In the following discussion, we also refer to the discretization as an
N-discretization in the sense that N refers to the number of grid points
in the finite element mesh. To be more precise, we should include
notation identifying the quadrature rule here as well. However, since we
are comparing approximations using a four-point quadrature rule in both
cases, we choose to simplify the notation as much as possible.
Furthermore, J denotes an arbitrary cost function which depends on the
design parameter q. A sensitivity approach is said to produce consistent
derivatives with respect to the state and sensitivity approximations
using the N-discretization if



$$\nabla J^N(q) = [\nabla J(q)]^N \quad \forall q \in Q. \qquad (24)$$

This definition states that the gradient of the approximate cost function
is the same as the approximation of the true gradient. A more relaxed
definition stipulates that the difference between the two gradient
approximations should approach 0 with grid refinement. In particular, a
sensitivity approach is said to produce asymptotically consistent
derivatives with respect to the state and sensitivity approximations
using the N-discretization if



$$\left\| \nabla J^N(q) - [\nabla J(q)]^N \right\| \to 0 \quad \forall q \in Q \qquad (25)$$
Figure 3. True Gradient and Approximations for Computational Approach 1




as $N \to \infty$, that is, as the grid is refined. The computational
approaches presented in Sections 3 and 4 fall into the category of an
approximation of the true gradient. Hence, Computational Approach 1
produces

[VF(^)]^, and Computational Approach 2 yields |^VF(g)j . 

In the following figures, we present a sample of the gradient approx- 
imations obtained using each computational approach. The gradient 
approximations are compared with both a centered difference gradient 
approximation (solid curve with o’s, representing and VF^, re- 

spectively) and the true gradient (solid curve). In Figure 3, the gradient 
approximations generated using Computational Approach 1 converge to 
the finite difference gradient (and to the true gradient) with mesh re- 
finement. Hence, Computational Approach 1 yields asymptotically con- 
sistent derivatives. Figure 4 indicates that Computational Approach 2 
produces consistent derivatives. 
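
The centered difference gradient that serves as a reference in these comparisons can be sketched as follows. This is only a minimal illustration: the function `example_cost` and the step size `h` are assumptions introduced here and are not the cost function or step used for the figures.

```python
import math

def centered_difference(cost, q, h=1e-4):
    """Second-order centered difference approximation of d cost / d q."""
    return (cost(q + h) - cost(q - h)) / (2.0 * h)

# Hypothetical smooth cost function, used only to exercise the formula.
def example_cost(q):
    return 0.5 * math.sin(q) ** 2

approx = centered_difference(example_cost, 2.0)
exact = math.sin(2.0) * math.cos(2.0)   # analytic derivative of example_cost
print(abs(approx - exact))              # small: truncation error is O(h^2)
```

Because the difference quotient is applied to the approximate cost function, it approximates the gradient of the approximate cost, which is the quantity appearing on the left of the consistency definitions above.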

5.2 Anecdotal Observation 

In the case where a three-point Gauss quadrature rule is used, the 
quadrature rule is insufficient for convergence of the cost function ap- 
proximations as the mesh is refined. That is, if we use three quadrature 
points for the integral approximations, then the cost function approxima- 
Figure 4. True Gradient and Approximations for Computational Approach 2



tions for each approach are given in Figures 5 and 6. We have included 
the graphs for N = 3 grid points and N = 33 grid points to show that
the accuracy of the approximations does not improve with mesh
refinement, and the error (using $\|\cdot\|_\infty$) in these
approximations is on the order
of 10~^. All of the approximations generated using values of N between 
3 and 33 exhibit exactly the same behavior. The gradient approximations
for this case are somewhat interesting. In particular, Figures 7 and 8
suggest that Computational Approach 1 produces asymptotically consistent
gradients while Computational Approach 2 produces inconsistent or
“non-consistent” gradient approximations. This behavior may be a result
of the fact that we use the transformed cost function approximation
during the gradient calculation on the computational space; recall the
expression in (20). The gradient expression for Computational
Approach 1, in (11), does not explicitly involve the cost function, F(q).
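
The role of the degree of accuracy of a quadrature rule can be illustrated with a small sketch. It uses the standard tabulated Gauss-Legendre nodes and weights; the test integrand $x^6$ is an assumption chosen only because an n-point Gauss rule is exact for polynomials up to degree 2n - 1, so the three-point rule (degree 5) misses it while the four-point rule (degree 7) does not. It does not reproduce the integrands of the optimal design problem.

```python
import math

# Gauss-Legendre nodes and weights on [-1, 1] (standard tabulated values).
GAUSS3 = [(-math.sqrt(3 / 5), 5 / 9), (0.0, 8 / 9), (math.sqrt(3 / 5), 5 / 9)]

_a = math.sqrt(3 / 7 - (2 / 7) * math.sqrt(6 / 5))
_b = math.sqrt(3 / 7 + (2 / 7) * math.sqrt(6 / 5))
_wa = (18 + math.sqrt(30)) / 36
_wb = (18 - math.sqrt(30)) / 36
GAUSS4 = [(-_b, _wb), (-_a, _wa), (_a, _wa), (_b, _wb)]

def quad(rule, f, lo, hi):
    """Apply a Gauss rule given on [-1, 1] to the interval [lo, hi]."""
    half = 0.5 * (hi - lo)
    mid = 0.5 * (hi + lo)
    return half * sum(w * f(mid + half * x) for x, w in rule)

def integrand(x):
    return x ** 6          # degree 6: beyond the three-point rule

exact = 1.0 / 7.0          # integral of x^6 over [0, 1]
err3 = abs(quad(GAUSS3, integrand, 0.0, 1.0) - exact)
err4 = abs(quad(GAUSS4, integrand, 0.0, 1.0) - exact)
print(err3, err4)          # err3 is visibly nonzero, err4 is at rounding level
```

A quadrature error of this kind does not shrink when the finite element mesh is refined, which is in line with the behavior reported for the three-point rule.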



6. Computational Issues 

We conclude with some observations gathered during the course of the
research. Since the domain transformations depend explicitly on the
parameter, spatial derivatives are also parameter dependent and appear
explicitly in both the transformed cost function and the transformed
state equation. As a result, the derivation of a gradient expression is
tedious and involves several terms including the transformed cost
function, F(q). We approximate F(q) at each iteration of the
optimization algorithm, and the approximation is reused in the gradient
approximation routine. However, this may require good judgement for the
quadrature rules as shown in Section 5.2. The results given here
indicate that CSEMs can yield accurate, consistent gradients provided
that the numerical schemes are chosen with care. Using the domain
transformations is advantageous for the rigorous mathematical derivation
of sensitivity equations. However, the issue of differentiability of the
mappings becomes an important question for both gradient derivation and
sensitivity analysis in Computational Approach 2 for problems with
complicated geometries.

Figure 5. True Cost Function and Approximations for Computational Approach 1
using three-point quadrature rule

Figure 6. True Cost Function and Approximations for Computational Approach 2
using three-point quadrature rule

Figure 7. True gradient, finite difference gradient and approximations for
Computational Approach 1 using three-point quadrature rule

Figure 8. True gradient, finite difference gradient and approximations for
Computational Approach 2 using three-point quadrature rule

In Computational Approach 1, the derivation of the sensitivity equation
is somewhat ad hoc; however, differentiation of the domain mapping is
not required. One must also be willing to accept the asymptotically
consistent derivatives that this method produces. For many problems, we
observe that the gradient approximations for this approach tend to
accurately pinpoint the location of the root of the gradient even on
coarse meshes. Finally, the results given in Section 5.2 indicate that
for certain problems, CSEMs can produce asymptotically consistent
gradients even if the cost function approximations are inaccurate. Each
computational algorithm exhibits specific characteristics that can be
viewed as advantageous. Further research to determine which
computational approach best fits a given problem is needed.
Abstract For computational purposes such as debugging, derivative
computations using the reverse mode of automatic differentiation, or
optimal control by Newton’s method, one may need to reverse the
execution of a program. The simplest option is to record a complete
execution log and then to read it backwards. As a result, massive
amounts of storage are normally required. This paper proposes a new
approach to reversing program executions. The presented technique runs
the forward simulation and the reversal process at the same speed. For
that purpose, one only employs a fixed and usually small amount of
memory pads called checkpoints to store intermediate states and a
certain number of processors. The execution log is generated piecewise
by restarting the evaluation repeatedly and concurrently from suitably
placed checkpoints. The paper illustrates the principal structure of
time-minimal parallel reversal schedules and quotes the required
resources. Furthermore, some specific aspects of adjoint calculations
are discussed. Initial results for the steering of a Formula 1 car are
shown.

Keywords: Adjoint calculation, Checkpointing, Parallel computing

1. Introduction and Notation 

For many industrial applications, rather complex interactions between
various components have been successfully simulated with computer
models. This is true for several production processes, e.g. steel
manufacturing with regard to various product properties, for example
stress distribution. However, the simulation stage frequently cannot be
followed by an optimization stage, which would be very desirable. This
situation is very often caused by the lack or inaccuracy of derivatives,
which are needed in optimization algorithms. Hence, enabling the
transition from simulation to optimization represents a challenging
research task.

The technique of algorithmic or automatic differentiation (AD), which
is not yet well enough known, offers an opportunity to provide the
required derivative information [5]. Therefore, AD can contribute to
overcoming the step from pure simulation, and hence “trial and error”
improvements, to an exact analysis and systematic derivative-based
optimization.

The key idea of algorithmic differentiation is the systematic applica- 
tion of the chain rule. The mathematical specification of many applica- 
tions involves nonlinear vector functions 

$$F: \mathbb{R}^n \to \mathbb{R}^m, \quad x \mapsto F(x),$$

that are typically defined and evaluated by computer programs. This 
computation can be decomposed into a (normally large) number of very 
simple operations, e.g. additions, multiplications, and trigonometric or 
exponential function evaluations. The derivatives of these elementary 
operations can be easily calculated with respect to their arguments. A 
systematic application of the chain rule yields the derivatives of a hier- 
archy of intermediate values. Depending on the starting point of this 
methodology, either at the beginning or at the end of the sequence of 
operations considered, one distinguishes between the forward mode and 
the reverse mode of AD. The reverse mode of algorithmic differentiation 
is a discrete analog of the adjoint method known from the calculus of 
variations. 

The reverse mode in its basic form yields the gradient of a
scalar-valued function for no more than five times the operations count
of evaluating the function itself. This bound is completely independent
of the number of independent variables. More generally, this mode
allows the computation of Jacobians for at most five times the number
of dependents times the effort of evaluating the underlying vector
function. However, the spatial complexity of the basic reverse mode,
i.e. its memory requirement, is proportional to the temporal complexity
of the evaluation of the function itself. This behaviour is caused by
the fact that one has to record a complete execution log onto a data
structure called tape and subsequently read this tape backward. For each
arithmetic operation, the execution log contains a code and the
addresses of the arguments as well as the computed value. It follows
that the practical exploitation of the advantageous temporal complexity
bound for the reverse mode is severely limited by the amount of memory
required.
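
The tape mechanism described above can be sketched in a few lines. The class `Tape` and its methods are hypothetical names introduced for this illustration and do not correspond to any particular AD tool; the point is only that every elementary operation appends one entry (code, argument addresses, value) to the log, so the memory grows with the operation count, and that the gradient is obtained by reading the log backward.

```python
import math

class Tape:
    """Execution log: one entry (op code, argument indices, value) per operation."""

    def __init__(self):
        self.entries = []

    def _push(self, op, args, value):
        self.entries.append((op, args, value))
        return len(self.entries) - 1          # address of the new value

    def var(self, value):
        return self._push("var", (), value)

    def add(self, i, j):
        return self._push("add", (i, j), self.entries[i][2] + self.entries[j][2])

    def mul(self, i, j):
        return self._push("mul", (i, j), self.entries[i][2] * self.entries[j][2])

    def sin(self, i):
        return self._push("sin", (i,), math.sin(self.entries[i][2]))

    def gradient(self, out):
        """Read the tape backward and accumulate the adjoints of all entries."""
        bar = [0.0] * len(self.entries)
        bar[out] = 1.0
        for k in range(out, -1, -1):
            op, args, _ = self.entries[k]
            if op == "add":
                i, j = args
                bar[i] += bar[k]
                bar[j] += bar[k]
            elif op == "mul":
                i, j = args
                bar[i] += bar[k] * self.entries[j][2]
                bar[j] += bar[k] * self.entries[i][2]
            elif op == "sin":
                (i,) = args
                bar[i] += bar[k] * math.cos(self.entries[i][2])
        return bar

# f(x, y) = x * y + sin(x), evaluated at (2, 3)
tape = Tape()
x = tape.var(2.0)
y = tape.var(3.0)
f = tape.add(tape.mul(x, y), tape.sin(x))
bar = tape.gradient(f)
print(bar[x], bar[y])      # df/dx = y + cos(x), df/dy = x
```

One backward sweep delivers the derivatives with respect to all independent variables at once, which reflects the independence of the cost bound from the number of independents.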

The reversal of a given function F is already being extensively used
to calculate hand-coded adjoints. In particular, there are several
contributions on weather data assimilation (e.g. [11]). Here, the
desired gradients can be obtained with a low temporal complexity by
integrating the linear co-state equation backwards along the trajectory
of the original simulation. This well-known technique is closely related
to the reverse mode of AD [3]. Moreover, debugging and interactive
control may require the reconstruction of previous states by some form
of running the program that evaluates F backwards. The need for some
kind of logging arises whenever the process described by F is not
invertible or ill conditioned. In these cases one cannot simply apply an
inverse process to evaluate the inverse mapping $F^{-1}$. Consequently,
the reversal of a program execution within a reasonable memory
requirement has received some (but only perfunctory) attention in the
computer science literature (see e.g. [12]).

This paper presents a new approach to reversing the calculation of F.
To that end, in the remainder of this section, the structure of the
function F is described in detail. The reversal technique proposed in
this article only employs a fixed and usually small amount of memory
pads to store intermediate states and a certain number of processors for
reversing F in minimal time. The corresponding time-minimal parallel
reversal schedules are introduced in Section 2. The simulation of a
Formula 1 car is considered in Section 3. The underlying ODE system is
introduced. Then two different ways to calculate adjoints are discussed.
Subsequently, the initial numerical results are presented. Finally, some
conclusions are drawn in Section 4.

Throughout it is assumed that the evaluation of F comprises the
evaluation of subfunctions $F_i$, $1 \le i \le l$, called physical
steps, that act on state $x^{i-1}$ to calculate the subsequent
intermediate state $x^i$ for $1 \le i \le l$ depending on a control
$u^{i-1}$. Hence, one has

$$x^i = F_i(x^{i-1}, u^{i-1}).$$

Therefore, F can be thought of as a discrete evolution. The intermediate
states of the evolution F represented by the counter i should be thought
of as vectors of large dimensions. The physical steps $F_i$ describe
mathematical mappings that in general cannot be reversed at a reasonable
cost even for given $u^{i-1}$. Hence, it is impossible to simply apply
the inverses $F_i^{-1}$ in order to run the program backwards from state
l to state 0. It will also be assumed that due to their size, only a
limited number of intermediate states can be kept in memory.

Furthermore, it is supposed that for each $i \in \{1, \ldots, l\}$ there
exist functions $\hat F_i$ that cause the recording of intermediate
values generated during the evaluation of $F_i$ onto the tape and
corresponding functions $\bar F_i$ that perform the reversal of the ith
physical step using this tape. More precisely, one has the reverse steps

where $F_i'$ denotes the Jacobian of $F_i$ with respect to $x^{i-1}$ and
$u^{i-1}$. The calculation of adjoints using the basic approach is
depicted in Figure 1.

Figure 1. Naive approach to calculate Adjoints

Applying a checkpointing technique, the execution log is generated
piecewise by restarting the evaluation repeatedly from suitably placed
checkpoints, according to requests by the reversal process. Here, the
checkpoints can be thought of as pointers to nodes representing
intermediate states i. Using a checkpointing strategy on a uni-processor
machine, the calculation of F can be reversed, even in such cases where
the basic reverse mode fails due to excessive memory requirement (see
e.g. [7, 6]). However, the runtime for the reversal process increases
compared to the naive approach. For multi-processor machines, this paper
presents a checkpointing technique with concurrent recalculations that
reverses the program execution in minimal wall-clock time.
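
The difference between the naive approach and a serial checkpointing strategy can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the scalar dynamics in `forward`, the omission of the control, and in particular the uniform checkpoint spacing `stride`, which is chosen for simplicity and is not one of the optimal schedules cited above.

```python
import math

def forward(i, x):
    """Physical step F_i (hypothetical scalar example dynamics)."""
    return x + 0.1 * math.sin(x) + 0.01 * i

def recording(i, x):
    """Recording step: advance the state and log what the reverse step needs."""
    tape = 1.0 + 0.1 * math.cos(x)        # local Jacobian dF_i/dx
    return forward(i, x), tape

def reverse(i, tape, xbar):
    """Reverse step: propagate the adjoint xbar back through step i."""
    return xbar * tape

def adjoint_naive(x0, l):
    """Record the complete log (l tapes), then read it backward."""
    tapes, x = [], x0
    for i in range(1, l + 1):
        x, t = recording(i, x)
        tapes.append(t)
    xbar = 1.0
    for i in range(l, 0, -1):
        xbar = reverse(i, tapes[i - 1], xbar)
    return xbar

def adjoint_checkpointed(x0, l, stride):
    """Keep only every stride-th state; regenerate the log segment by segment."""
    checkpoints = {0: x0}
    x = x0
    for i in range(1, l + 1):
        x = forward(i, x)
        if i % stride == 0 and i < l:
            checkpoints[i] = x
    xbar, end = 1.0, l
    for start in sorted(checkpoints, reverse=True):
        x, tapes = checkpoints[start], []
        for i in range(start + 1, end + 1):   # recompute steps start+1 .. end
            x, t = recording(i, x)
            tapes.append(t)
        for i in range(end, start, -1):       # and reverse them immediately
            xbar = reverse(i, tapes[i - start - 1], xbar)
        end = start
    return xbar

print(adjoint_naive(0.5, 10), adjoint_checkpointed(0.5, 10, 3))
```

The checkpointed variant never holds more than `stride` tapes at once, at the price of repeating the forward steps, which is the runtime increase mentioned above for the uni-processor case.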

2. Time-minimal Parallel Reversal Schedules 

To derive an optimal reversal of the evaluation procedure F, one has
to take into account four kinds of parameters, namely:

1.) the number $l$ of physical steps to be reversed;

2.) the number $p$ of processors that are available;

3.) the number $c$ of checkpoints that can be accommodated; and

4.) the step costs: $\tau = \mathrm{TIME}(F_i)$, $\hat\tau = \mathrm{TIME}(\hat F_i)$, $\bar\tau = \mathrm{TIME}(\bar F_i)$.

Well known reversal schedules for serial machines, i.e. $p = 1$, and
constant step costs $\tau$ allow an enormous reduction of the memory
required to reverse a given evolution F in comparison with the basic
approach (see e.g. [7, 6]). Even if the step costs
$\tau_i = \mathrm{TIME}(F_i)$ are not constant it is possible to compute
optimal serial reversal schedules [13]. However, one has to pay for the
improvements in the form of a greater temporal complexity because of
repeated forward integrations.

If no increase in the time needed to reverse F is acceptable, the use
of a sufficiently large number of additional processors provides the
possibility to reverse the evolutionary system F with drastically
reduced spatial complexity and still minimal temporal complexity.
Corresponding parallel reversal schedules that are optimal for given
numbers $l$ of physical steps, $p > 1$ processors, $c$ checkpoints, and
constant step costs were presented for the first time in [13]. For that
purpose, it is supposed that $\tau = 1$, $\hat\tau \ge 1$, and
$\bar\tau \ge 1$, with $\hat\tau, \bar\tau \in \mathbb{N}$. Furthermore,
it is always assumed that the memory requirement for storing the
intermediate states is the same for all i. Otherwise, it is not clear
whether and how parallel reversal schedules can be constructed and
optimized. The techniques developed in [13] can certainly not be
applied. In practical applications, nonuniform state sizes might arise,
for example as a result of adaptive grid refinements, or function
evaluations that do not conform naturally to our notion of an
evolutionary system on a state space of fixed dimension.

Finding a time-minimal parallel reversal schedule can be interpreted
as a very special kind of scheduling problem. The general problem class
is known to be NP-hard (e.g. [4]). Nevertheless, it is possible to
specify suitable time-minimal parallel reversal schedules for an
arbitrary number $l$ of physical steps because the reversal of a program
execution has a very special structure. For the development of these
time-minimal and resource-optimal parallel reversal schedules, first an
exhaustive search algorithm was written. The input parameters were the
number $p$ of available processors and the number $c$ of available
checkpoints with both $\hat\tau$ and $\bar\tau$ set to 1. The program
then computed a schedule that reverses the maximal number of physical
steps $l(p, c)$ in minimal time using no more than the available
resources $p$ and $c$ for $p + c \le 10$. Here, minimal time means the
wall clock equivalent to the basic approach of recording all needed
intermediate results. Examining the corresponding parallel reversal
schedules, one found that for $p > c$, only the resource number
$\varrho = p + c$ has an influence on $l(p, c) = l_\varrho$. Therefore,
the development of time-minimal parallel reversal schedules that are
also resource-optimal is focused on a given resource number $\varrho$
under the tacit assumption $p > c$. The results obtained for
$\varrho \le 10$ provided sufficient insight to deduce the general
structure of time-minimal parallel reversal schedules for arbitrary
combinations of $\hat\tau \ge 1$, $\bar\tau \ge 1$, and $\varrho > 10$.
Neglecting communication cost, the following recurrence is established
in [13]:

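Theorem: Given the number of available resources $\varrho = p + c$ with
$p > c$ and the temporal complexities $\hat\tau \in \mathbb{N}$ and
$\bar\tau \in \mathbb{N}$ of the recording steps $\hat F_i$ and the
reverse steps $\bar F_i$, then the maximal length of an evolution that
can be reverted in parallel without interruption is given by

$$l_\varrho = \begin{cases} \varrho & \text{if } \varrho < 2 + \bar\tau/\hat\tau, \\ l_{\varrho-1} + \hat\tau\, l_{\varrho-2} - \bar\tau + 1 & \text{else.} \end{cases} \qquad (1)$$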

In order to prove this result, first an upper bound on the number of
physical steps that can be reversed with a given number $\varrho$ of
processors and checkpoints was established. Subsequently, corresponding
reversal schedules that attain this upper bound were constructed
recursively. For this purpose, the resource profiles of the constructed
parallel reversal schedules were analyzed in detail. In addition to the
recursive construction of the desired time-minimal reversal schedules,
the resource profiles yield an upper bound for the number $p$ of
processors needed during the reversal process. To be more precise, for
reversing $l_\varrho$ physical steps, one needs no more than
if r > f 
else 



processors [13]. Hence, roughly half of the resources have to be proces- 
sors. This fact offers the opportunity to assign one checkpoint to each 
processor. 

A time-minimal reversal schedule for $l = 55$ is depicted in Figure 2.
Here, vertical bars represent checkpoints and slanted bars represent
running processes. The shading indicates the physical steps $F_i$, the
recording steps $\hat F_i$ and the reverse steps $\bar F_i$ to be
performed.

Figure 2. Time-minimal Parallel Reversal Schedule for $l = 21$ and $\hat\tau = \bar\tau = 1$

Based on the recurrence (1), it is possible to describe the behaviour
of $l_\varrho$ more precisely. For $\hat\tau = \bar\tau = 1$, one finds
that the formula for $l_\varrho$ is equal to the Fibonacci number
$f_{\varrho-1}$. Moreover, for other combinations of
$\hat\tau, \bar\tau \in \mathbb{N}$, the recurrence (1) produces
generalized Fibonacci numbers (see e.g. [9]). More specifically, one
finds that

$$l_\varrho \sim \frac{1}{2}\left(1 + \frac{3}{\sqrt{1 + 4\hat\tau}}\right)\left(\frac{1}{2}\left(1 + \sqrt{1 + 4\hat\tau}\right)\right)^{\varrho - 1}$$

in the sense that the ratio between the two sides tends to 1 as
$\varrho$ tends to infinity. In the important case $\hat\tau = 1$ even
their absolute difference tends to zero. Thus, $l = l_\varrho$ grows
exponentially as a function of $\varrho \approx 2p$ and conversely
$p \approx c$ grows logarithmically as a function of $l$. In order to
illustrate the growth of $l_\varrho$, assume 16 processors and 16
checkpoints are available. These resources suffice to reverse an
evolution of $l = 2\,178\,309$ physical steps when
$\hat\tau = \bar\tau = 1$ and even more steps if $\bar\tau = 1$ and
$\hat\tau > 1$.
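
The growth of the maximal reversible length can be tabulated with a short sketch. It assumes that the recurrence (1) reads $l_\varrho = \varrho$ for $\varrho < 2 + \bar\tau/\hat\tau$ and $l_\varrho = l_{\varrho-1} + \hat\tau\, l_{\varrho-2} - \bar\tau + 1$ otherwise; the function names are introduced here for illustration only, and the indexing convention used in [13] may differ from the one assumed in this sketch.

```python
import math

def max_steps(rho, tau_hat=1, tau_bar=1):
    """l_rho for rho >= 1 from the recurrence as stated in the lead-in (an assumption)."""
    l = {}
    for r in range(1, rho + 1):
        if r < 2 + tau_bar / tau_hat:
            l[r] = r
        else:
            l[r] = l[r - 1] + tau_hat * l[r - 2] - tau_bar + 1
    return l[rho]

def asymptotic(rho, tau_hat=1):
    """Leading term of the solution of the recurrence for tau_bar = 1."""
    s = math.sqrt(1 + 4 * tau_hat)
    return 0.5 * (1 + 3 / s) * (0.5 * (1 + s)) ** (rho - 1)

print([max_steps(r) for r in range(1, 9)])      # [1, 2, 3, 5, 8, 13, 21, 34]
print(max_steps(20) / asymptotic(20))           # ratio close to 1
print(max_steps(12, tau_hat=2) > max_steps(12)) # larger recording cost, more steps
```

The exponential growth in the number of resources is what makes a logarithmic number of processors and checkpoints sufficient for long evolutions.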

For $\bar\tau = 1$, i.e., if the forward simulation and the reversal of
the time steps can be performed at the same speed, this theory was
implemented using the distributed memory programming model [10]. It is
therefore possible to run the parallel reversal schedules framework on
most parallel computers independent of their actual memory structure. To
achieve a flexible implementation, the MPI routines for the
communication are used. The parallel reversal schedules are worked off
in a process-oriented manner instead of a checkpoint-oriented manner
(see [10] for details). This yields the optimal resource requirements of
Theorem 1.

In order to apply the parallel reversal schedules framework, one has
to provide interfaces and define the main data structures for computing
the adjoint. The data structures required are the checkpoints, the
traces or tapes as a result of the recording step, and the adjoint
values. The structure and complexity of this data is independent of the
framework since the framework only calls routines such as

■ forward(..) for the evaluation of one physical step $F_i$

■ recording(..) for the evaluation of one recording step $\hat F_i$

■ reverse(..) for the evaluation of one reverse step $\bar F_i$,

provided by the user. These functions are equivalent to the functions
used for a sequential calculation of the adjoint. The index i is an
argument of each of the modules. The function recording(..) generates
the trace or tape. The function reverse(..) obtains the trace of the
last recording step and the adjoint computed so far as arguments.
Furthermore, if $i = l$ the function reverse(..) may initialize the
adjoints.

Additionally, the user must code communication modules, for example
sendCheckpoint(..) and receiveCheckpoint(..). All user-defined routines
have to be implemented applying MPI routines. The required process
identifications and message tags are arguments of routines provided by
the parallel reversal schedules framework.



3. Model Problem: Steering a Formula 1 Car 

In order to test the implementation of parallel reversal schedules, the 
simulation of an automobile is considered. The aim is to minimize the 
time needed to travel along a specific road. A simplified model of a 
Formula 1 racing car [1] is employed. It is given by the ODE system: 



xi 

X2 

^3 

X4 

±5 

Xq 

X7 



= X2 

^ (■P’m (x,U2) + Fjj2(x,U2))lf - (F^3 (x,U 2) + F^^(x,U2))lr 

I 

Fr]l (^5 ^2) ^ T }2 (^1 ^2) “ 1 “ (^5 ^2) “i” -^774 (^5 ^2)) 

= M 

= X4 sin( 3 ::i) + xs cos(a;i) 

== X4^ cos(rri) — xs sin(o;i) 

= ui. 



Hence, a go-kart model with rigid suspension and a body rolling about a
fixed axis is considered. There are seven state variables representing
the yaw angle and rate ($x_1$, $x_2$), the lateral and longitudinal
velocity ($x_3$, $x_4$), global position ($x_5$, $x_6$), and the vehicle
steer angle ($x_7$) as shown in Figure 3. The control variables are
$u_1$ denoting the front steer rate and $u_2$ denoting the longitudinal
force as input. The lateral and longitudinal vehicle forces $F_\eta$ and
$F_\xi$ are computed using the state and the control variables as well
as the tire forces given by a tire model described in [2]. The force
$F_a$ represents the aerodynamic drag depending on the longitudinal
velocity. All other values are fixed car parameters such as mass $M$ and
length of the car given by $l_f$ and $l_r$.

Figure 3. Model of Formula 1 Car.

In order to judge the quality of the driven line, the cost functional

$$J(s_l) = \int_0^{s_l} S_{cf}(x, s)\,\bigl(1 + g(x, s)\bigr)\,ds \qquad (2)$$

is used. The scaling factor $S_{cf}(x, s)$ changes the original time
integration within the cost function to distance integration. Therefore,
an integration over the arc length is performed. This variable change
has to be done because the end time $t_l$ of the time integration is the
value one actually wants to minimise. Hence, $t_l$ is unknown. The
computation of the scaling factor $S_{cf}(x, s)$ is described in [1].
The function $g(x, s)$ measures whether or not the car is still on the
road. The road is defined by the road centre line and road width. In the
example presented here, the road width is a constant 2.5 m along the
whole integration path. The function $g(x, s)$ returns zero as long as
the car drives within the road boundaries. If the car leaves the road
then $g(x, s)$ returns the distance from the car to the road boundary
squared.
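
The behaviour of the penalty function described above can be sketched as follows. The function name `road_penalty`, the reduction of the car to a point, and the use of a signed lateral offset from the road centre line are assumptions made for this illustration; how the offset is obtained from the state and the arc length is not shown.

```python
def road_penalty(offset, road_width=2.5):
    """Squared distance to the nearest road boundary, zero while on the road.

    `offset` is the (hypothetical) signed lateral distance of the car from
    the road centre line, in metres; `road_width` is the full road width.
    """
    excess = abs(offset) - 0.5 * road_width
    return excess * excess if excess > 0.0 else 0.0

print(road_penalty(1.0))    # inside the road: no penalty
print(road_penalty(-2.25))  # 1 m beyond the boundary: penalty of 1 m^2
```

Since the penalty and its first derivative vanish at the boundary, the integrand of the cost functional remains continuously differentiable there.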

3.1. The Forward Integration 

For the numerical results presented here, a discretization has to be
applied. Therefore, an appropriate initial vector $x^0$ and the starting
position $s^0 = 0$ were chosen. The route is divided equidistantly with
a step size of $h = 10$ cm. The well known four-stage Runge-Kutta scheme

$$\begin{aligned}
k_1 &= f(x^{i-1}, u(s^{i-1})) \\
k_2 &= f(x^{i-1} + h k_1/2,\; u(s^{i-1} + h/2)) \\
k_3 &= f(x^{i-1} + h k_2/2,\; u(s^{i-1} + h/2)) \\
k_4 &= f(x^{i-1} + h k_3,\; u(s^{i-1} + h)) \\
x^i &= x^{i-1} + h (k_1 + 2 k_2 + 2 k_3 + k_4)/6
\end{aligned} \qquad (3)$$

serves as physical step $F_i$ for $i = 1, \ldots, 1000$.
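
One physical step of this kind can be sketched as follows. The function `rk4_step`, the right-hand side `decay` and the control `zero_control` are hypothetical names and test data introduced for illustration; they do not represent the car model.

```python
import math

def rk4_step(f, u, x, s, h):
    """One four-stage Runge-Kutta step advancing the state x from s to s + h.

    f(x, u_value) returns the right-hand side as a list; u(s) is the control.
    """
    def shifted(x, k, a):
        return [xi + a * ki for xi, ki in zip(x, k)]

    k1 = f(x, u(s))
    k2 = f(shifted(x, k1, h / 2), u(s + h / 2))
    k3 = f(shifted(x, k2, h / 2), u(s + h / 2))
    k4 = f(shifted(x, k3, h), u(s + h))
    return [xi + h * (a + 2 * b + 2 * c + d) / 6
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]

# Hypothetical test problem: x' = -x + u with u = 0, so x(1) = exp(-1).
def decay(x, u_value):
    return [-x[0] + u_value]

def zero_control(s):
    return 0.0

x, s, h = [1.0], 0.0, 0.1
for _ in range(10):
    x = rk4_step(decay, zero_control, x, s, h)
    s += h
print(abs(x[0] - math.exp(-1.0)))   # small: the scheme is fourth-order accurate
```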

The calculations of a physical step $F_i$ form the forward(..) routine
needed by the time-minimal parallel reversal schedules. As mentioned
above, in addition to this, one has to provide two further routines,
namely recording(..) and reverse(..). The content of these two modules
is described in the next subsection.



3.2. Calculating Adjoints 



There are two basic alternatives for calculating the adjoints of a given
model. Firstly, one may form the adjoint of the continuous model
equation and discretize the continuous adjoint equation. Secondly, one
may use automatic differentiation (AD), or hand-coding, to adjoin the
discrete evaluation procedure of the model. The two operations,
adjoining and discretizing, do not commute in general (see e.g. [8]).
Therefore, one has to be careful when deciding how to calculate the
desired adjoints. For the computations shown below, the second option
was applied, namely the adjoining of the discretized equation (3).
Application of AD in reverse mode amounts to the following adjoint
calculation $\bar F_i$ (see e.g. [5]):




tt3 = hx^ jZ -F /i&4 
U2 = hxj jZ + /i6s/2 
a\ — hx^ j?i + /i 62/2 
uj — 



k- - —k- 

64 = tt4A;4 
63 = asks 

62 = a2^2 
b\ = a\ki 
— a\ki 






dJ 

dx'^~^ 



+ + 61 + 62 + f>3 + 64, 



1 < j < 4 



( 4 ) 



for $i = l, \dots, 1$, where the functions $k_j$, $1 \le j \le 4$, are 
defined as in (3). Here, $\bar{u}^i$ denotes the adjoint of the control 
$u$ at $s^i$. Note that the integration of the adjoint scheme (4) has to 
be performed in reverse order starting at $i = l$. One uses 
$\bar{x}^l_i = \partial J/\partial x^l_i$, $1 \le i \le 7$, and 
$\bar{u}^l_i = 0$, $i = 1, 2$, as initial values because of the 
influence on the cost functional (2). After the complete adjoint 
calculation, each value $\bar{u}^i$ denotes the sensitivity of the cost 
functional $J$ with respect to the value $u^i$. 
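
The reverse sweep described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used for the reported results: the right-hand side `f` (a pendulum), its Jacobian `jac`, the weight vector `w` and the step count are hypothetical, the control is held fixed and omitted, and the cost functional depends only on the final state, so the adjoint is initialised with its gradient at the last step and no further cost terms enter. All states are logged during the forward sweep, which corresponds to the naive approach.

```python
import numpy as np

def f(x):
    # Hypothetical right-hand side (pendulum); control held fixed and omitted.
    return np.array([x[1], -np.sin(x[0])])

def jac(x):
    # Jacobian of f with respect to x.
    return np.array([[0.0, 1.0], [-np.cos(x[0]), 0.0]])

def rk4_step(x, h):
    k1 = f(x)
    k2 = f(x + h * k1 / 2)
    k3 = f(x + h * k2 / 2)
    k4 = f(x + h * k3)
    return x + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6

def rk4_adjoint_step(x, xbar, h):
    """Map the adjoint of x^i to the adjoint of x^{i-1} (row-vector convention)."""
    # Recompute the stages to evaluate the Jacobians at the stage arguments.
    k1 = f(x)
    k2 = f(x + h * k1 / 2)
    k3 = f(x + h * k2 / 2)
    K1, K2 = jac(x), jac(x + h * k1 / 2)
    K3, K4 = jac(x + h * k2 / 2), jac(x + h * k3)
    a4 = h * xbar / 6
    b4 = a4 @ K4
    a3 = h * xbar / 3 + h * b4
    b3 = a3 @ K3
    a2 = h * xbar / 3 + h * b3 / 2
    b2 = a2 @ K2
    a1 = h * xbar / 6 + h * b2 / 2
    b1 = a1 @ K1
    return xbar + b1 + b2 + b3 + b4

h, steps = 0.1, 50
x0 = np.array([1.0, 0.0])
w = np.array([1.0, 2.0])          # hypothetical cost: J = w . x^l

# Forward sweep with complete logging of the states.
states = [x0]
for _ in range(steps):
    states.append(rk4_step(states[-1], h))

# Reverse sweep, starting at the last step with xbar^l = dJ/dx^l = w.
xbar = w.copy()
for i in range(steps, 0, -1):
    xbar = rk4_adjoint_step(states[i - 1], xbar, h)
```

After the reverse sweep, `xbar` holds the gradient of J with respect to the initial state, which can be checked against central finite differences.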

Now the content of the routine reverse(..) is clear. It has to contain 
the computations needed to perform an adjoint step $\bar{F}_i$ according 
to (4). However, there are two ways to implement the interface between 
the modules recording(..) and reverse(..). One can store the stages 
$k_j$, $1 \le j \le 4$, during the evaluation of the recording step 
$\hat{F}_i$. Then the corresponding reverse step $\bar{F}_i$ comprises 
all calculations shown in (4), including the computation of the 
Jacobians $K_j$, $1 \le j \le 4$. Alternatively, one can compute the 
Jacobians $K_j$, $1 \le j \le 4$, in the recording step $\hat{F}_i$ and 
store this information on the tape. Then the reverse step $\bar{F}_i$ 
only has to evaluate the last three statements of Equation (4). The 
runtimes presented here are based on the second approach in order to 
achieve $\bar{t} = 1$ for the temporal complexity of a reverse step. As 
a result, the temporal complexity $\hat{t}$ of a recording step equals 5. 
This implementation has the advantage that the value of $\bar{t}$ and 
hence the wall clock time are reduced at the expense of $\hat{t}$. This 
can be seen for example in Figure 2, where an increase of $\bar{t}$ 
would result in a steeper slope of the bar describing the adjoint or 
reverse computations. 
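
The second interface variant, in which the recording step evaluates the Jacobians and the reverse step only combines them, might look as follows. The names `recording_step`, `reverse_step` and `tape`, as well as the pendulum right-hand side, are assumptions made for this sketch; the control is again omitted, and the cost is taken to be the first component of the final state.

```python
import numpy as np

def f(x):
    # Hypothetical right-hand side (pendulum); control omitted.
    return np.array([x[1], -np.sin(x[0])])

def jac(x):
    # Jacobian of f with respect to x.
    return np.array([[0.0, 1.0], [-np.cos(x[0]), 0.0]])

def recording_step(x, h):
    """Advance x by one step of (3) and return the four Jacobians as tape entry."""
    p1 = x
    k1 = f(p1)
    p2 = x + h * k1 / 2
    k2 = f(p2)
    p3 = x + h * k2 / 2
    k3 = f(p3)
    p4 = x + h * k3
    k4 = f(p4)
    x_new = x + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return x_new, (jac(p1), jac(p2), jac(p3), jac(p4))

def reverse_step(entry, xbar, h):
    """Adjoint step that uses only the stored Jacobians, no evaluation of f."""
    K1, K2, K3, K4 = entry
    a4 = h * xbar / 6
    b4 = a4 @ K4
    a3 = h * xbar / 3 + h * b4
    b3 = a3 @ K3
    a2 = h * xbar / 3 + h * b3 / 2
    b2 = a2 @ K2
    a1 = h * xbar / 6 + h * b2 / 2
    b1 = a1 @ K1
    return xbar + b1 + b2 + b3 + b4

h, steps = 0.1, 20
x_start = np.array([1.0, 0.0])

x = x_start
tape = []
for _ in range(steps):
    x, entry = recording_step(x, h)
    tape.append(entry)

xbar = np.array([1.0, 0.0])       # adjoint of J = first component of x^l
while tape:
    xbar = reverse_step(tape.pop(), xbar, h)
```

In this arrangement the reverse step performs only a few vector-matrix products, while the recording step carries the cost of the four Jacobian evaluations in addition to the forward step.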

As mentioned above, one has to be careful about the adjoint calcu- 
lation because of the lack of commutativity between adjoining and dis- 
cretizing in general. Therefore, it is important to note that the Runge- 
Kutta scheme (3) belongs to a class of discretizations, for which both 
possibilities of adjoint calculation coincide, giving the same result [8]. 

3.3. Numerical Results 

To test the parallel reversal schedule framework, one forward 
integration of the car model shown in Figure 4 and one adjoint 
calculation were performed. As previously mentioned, the integration 
distance was 100 m and the step size 10 cm. Hence, there are 1000 
forward steps F_i. Figure 5(a) shows the growth of the cost functional, 
for which we computed the sensitivities with respect to the control 
variables $u_1$ (Figure 5(b)) and $u_2$ (Figure 5(c)). However, the 
resource requirements are of primary interest. One integration step in 
the example is relatively small in terms of computing time. In order to 
achieve reasonable timings, 18 integration steps form one physical step 
of the parallel reversal schedule. The remaining 10 integration steps 
were spread uniformly. Hence, one obtains 55 physical steps. Therefore, 
five processors were needed for the corresponding time-minimal parallel 
reversal schedule for $\hat{t} = \bar{t} = 1$. With small modifications, 
this reversal schedule is also nearly optimal for the considered 
combination $\hat{t} = 5$ and $\bar{t} = 1$. A sixth processor (master) 
was used to organise the program run. 

Figure 4. Position of Formula 1 Car (axis label: longitudinal position). 

Figure 5. Cost Functional and Adjoint of Control Variables. 
(a) Cost Functional J(s). (b) Adjoint of steering rate $u_1$. 
(c) Adjoint of longitudinal force $u_2$. 
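
The grouping of the 1000 integration steps into 55 physical steps can be reproduced with a short calculation. The text states only that the 10 remaining steps were spread uniformly; the particular placement rule in `partition_steps` below is therefore an assumption made for illustration.

```python
def partition_steps(n_steps, n_groups):
    """Split n_steps integration steps into n_groups contiguous physical steps
    whose sizes differ by at most one."""
    base, extra = divmod(n_steps, n_groups)
    # Assumed rule: place the `extra` longer groups evenly over the index range.
    return [
        base + ((g + 1) * extra // n_groups - g * extra // n_groups)
        for g in range(n_groups)
    ]

sizes = partition_steps(1000, 55)
n_long = sizes.count(19)    # physical steps holding 19 integration steps
n_short = sizes.count(18)   # physical steps holding 18 integration steps
```

With 1000 = 55 * 18 + 10, this yields 45 physical steps of 18 integration steps and 10 physical steps of 19 integration steps.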





                              naive approach   parallel checkpointing 
double variables needed           266010                5092 
memory required   in kByte        2128.1                40.7 
                  in %             100.0                 1.9 

Table 1. Memory Requirement 



The main advantage of the parallel reversal schedules is the enormous 
reduction in memory requirement, as illustrated in Table 1. It shows 
that for this example less than a fiftieth of the original memory 
requirement is needed, i.e., less than 2%. On the other hand, only six 
times the original computing power, i.e., six processors instead of one, 
is used. 
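
The entries of Table 1 follow from the number of double variables by simple arithmetic, as the sketch below shows. It assumes 8 bytes per double variable and 1 kByte = 1000 bytes, which reproduces the rounded table values; the function name `memory_kbyte` is hypothetical.

```python
BYTES_PER_DOUBLE = 8  # assumption: 8-byte double precision values

def memory_kbyte(n_doubles):
    # Assumption: 1 kByte = 1000 bytes, which matches the rounded table entries.
    return n_doubles * BYTES_PER_DOUBLE / 1000

naive_kbyte = memory_kbyte(266010)        # complete logging
checkpoint_kbyte = memory_kbyte(5092)     # parallel checkpointing
ratio_percent = 100 * checkpoint_kbyte / naive_kbyte
```

Rounded to one decimal place, the three values agree with the table entries 2128.1, 40.7 and 1.9.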

The theoretical runtime is also confirmed by the example, as can be 
seen in Table 2. Due to the slower memory interface on a Cray T3E, the 
use of less memory in parallel causes a measurable decrease in runtime 
(93.3% of the naive approach). On the other hand, the problem is too 
small and the SGI Origin 3800 too fast to show this effect. 
Nevertheless, one finds that the assumption of negligible communication 
cost is reasonable. The reason is that the processors have the duration 
of one full physical step to send and receive a checkpoint, because the 
checkpoint is not needed earlier. The communication cost becomes 
critical only if sending and receiving one checkpoint takes more time 
than one physical step. 





                          naive approach   parallel checkpointing 
T3E           in sec.          20.27               18.91 
              in %             100.0                93.3 
Origin 3800   in sec.           6.71                6.04 
              in %             100.0                90.0 

Table 2. Runtime results 



4. Conclusions 

The potentially enormous memory requirement of program reversal 
by complete logging often causes problems despite the ever increasing 
size of memory systems. This paper proposes an alternative method, 
where the memory requirement can be drastically reduced by keeping at 
most c intermediate states as checkpoints. In order to avoid an increase 
in runtime, p processors are used to reverse evolutions with minimal wall 
clock time. For the presented time-minimal parallel reversal schedules, 
the number $l$ of physical steps that can be reversed grows 
exponentially as a function of the resource number $\varrho = c + p$. A 
corresponding software tool has been coded using MPI. Initial numerical 
tests are reported. 
They confirm the enormous reduction in memory requirement. Further- 
more, the runtime behaviour is studied. It is verified that the wall clock 
time of the computation can be reduced compared to the logging-all ap- 
proach if the memory access is comparatively costly. This fact is caused 
by the reduced storage in use. If the memory access is comparatively 
cheap, the theoretical runtime of time-minimal parallel reversal sched- 
ules is also confirmed. 

The following overall conclusion can be drawn. For adjoining 
simulations, $\log_{a(\bar{t})}$(# physical steps) processors and 
checkpoints are wall clock equivalent to 1 processor and 
(# physical steps) checkpoints, where the base $a(\bar{t})$ depends on 
$\bar{t}$, the temporal complexity of a reverse step. 